Bug 258706

Summary: JS markdown parser performs 50x slower in JSC compared to V8, likely due to regex
Product: WebKit Reporter: Jarred Sumner <jarred>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Normal CC: john.david.dalton, mark.lam, msaboff, rniwa, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 271617    
Bug Blocks:    
Attachments:
Description Flags
Reproduction that jsc shell can run
none
Instruments profile of "vite dev" none

Description Jarred Sumner 2023-06-29 19:02:09 PDT
Created attachment 466880 [details]
Reproduction that jsc shell can run

I've attached a repro that can be executed in jsc shell directly. It prints out the length of the output file after parsing it.

❯ hyperfine "node out.mjs" "jsc -m out.mjs"
Benchmark 1: node out.mjs
  Time (mean ± σ):     123.8 ms ±  21.1 ms    [User: 94.7 ms, System: 11.0 ms]
  Range (min … max):    77.9 ms … 153.5 ms    16 runs
 
Benchmark 2: jsc -m out.mjs
  Time (mean ± σ):      6.803 s ±  0.211 s    [User: 2.511 s, System: 0.025 s]
  Range (min … max):    6.507 s …  7.196 s    10 runs
 
Summary
  'node out.mjs' ran
   54.95 ± 9.53 times faster than 'jsc -m out.mjs'

If we profile it with Instruments, we can see that majority of time is spent in JSC::Yarr

2.16 s   97.6%	2.15 s	 	 JSC::Yarr::Interpreter<char16_t>::matchDisjunction(JSC::Yarr::ByteDisjunction*, JSC::Yarr::Interpreter<char16_t>::DisjunctionContext*, bool)
3.00 ms    0.1%	3.00 ms	 	 void std::__1::__introsort<std::__1::_ClassicAlgPolicy, JSC::ARM64Assembler::jumpsToLink()::'lambda'(auto&, auto&)&, JSC::ARM64Assembler::LinkRecord*>(JSC::ARM64Assembler::LinkRecord*, JSC::ARM64Assembler::LinkRecord*, auto, std::__1::iterator_traits<JSC::ARM64Assembler::LinkRecord*>::difference_type)
2.00 ms    0.0%	2.00 ms	 	 JSC::BytecodeGenerator::variable(JSC::Identifier const&, JSC::ThisResolutionType)
5.00 ms    0.2%	2.00 ms	 	 void JSC::LinkBuffer::copyCompactAndLinkCode<unsigned int>(JSC::MacroAssembler&, JSC::JITCompilationEffort)
1.00 ms    0.0%	1.00 ms	 	 sys_icache_invalidate
3.00 ms    0.1%	1.00 ms	 	 JSC::ASTBuilder::SourceElements JSC::Parser<JSC::Lexer<unsigned char>>::parseModuleSourceElements<JSC::ASTBuilder>(JSC::ASTBuilder&)
1.00 ms    0.0%	1.00 ms	 	 ash_map.HashMapUnmanaged([]const u8,void,src.bun.StringHashMapContext,80).growIfNeeded
1.00 ms    0.0%	1.00 ms	 	 _platform_memset
1.00 ms    0.0%	1.00 ms	 	 __bzero
1.00 ms    0.0%	1.00 ms	 	 JSC::DFG::DefMethodClobberize<JSC::DFG::(anonymous namespace)::LocalCSEPhase::BlockCSE<JSC::DFG::(anonymous namespace)::LocalCSEPhase::LargeMaps>>::operator()(JSC::DFG::PureValue) const


https://github.com/oven-sh/bun/assets/709451/f12bdd48-4bbe-4dd7-83f8-94d7dd920204
Comment 1 Radar WebKit Bug Importer 2023-07-06 19:03:17 PDT
<rdar://problem/111883191>
Comment 2 Michael Saboff 2023-07-26 14:24:46 PDT
We are running several RegExp’s through the YARR interpreter.  I instrumented the RegExp engine to determine which ones and the reasons why.  There are 5 RegExp that contain variable counted parenthesis with a non-zero minimum.  All 5 of these RegExp contain one or more disjunctions like (?:\*[ \t]*){3,}.  There is one RegExp that contains a back reference that currently cannot be JIT’ed because it is a RegExp compiled for 16 bit character matching and it has the ignore case flag.

The first non-zero based variable counted parenthesis issue could be addressed at least two ways:
1) With some RegExp rewriting, e.g. changing (?:\*[ \t]*){3,} to (?:\*[ \t]*){3}(?:\*[ \t]*)*.  We currently do this for one or more variable counted parens, ie (?:\*[ \t]*)+
2) Some more involved work to properly handle the fixed non-zero count of variable counted parens in the JIT directly.

The reason we don’t support back references for ignore case 16bit JIT’ing is due to the complicated case folding rules for some Unicode characters.  Again there are two possible options for addressing this bug.
1) If the RegExp contains back references, allow the back reference if the referenced group’s contents are easily case folded.  8 bit characters would be easily handled by this fix.
2) Completely handle Unicode case folding.  This could be built upon the work of the first alternative.  A full implementation of this approach would require calling out to a case folding helper for some patterns.  This helper could be generated as needed.
Comment 3 Jarred Sumner 2023-07-26 15:08:36 PDT
Created attachment 467128 [details]
Instruments profile of "vite dev"
Comment 4 Jarred Sumner 2023-07-26 15:11:42 PDT
I added another case where JSC::Yarr::Interpreter ranks high in a profile. This one is running "vite dev". On the main thread, it's 90% of execution time. I don't have a minimal reproduction of this one yet, but what's interesting about the stack trace is it shows Yarr::Interpreter<unsigned char> which implies it's not 16 bit related. I'll try to make a simpler reproduction of this since a profile in of itself isn't very actionable

```
382.00 ms   91.6%	302.00 ms	191	 	 JSC::Yarr::Interpreter<unsigned char>::matchDisjunction(JSC::Yarr::ByteDisjunction*, JSC::Yarr::Interpreter<unsigned char>::DisjunctionContext*, bool)
62.00 ms   14.8%	44.00 ms	31	 	 JSC::Yarr::Interpreter<unsigned char>::allocParenthesesDisjunctionContext(JSC::Yarr::ByteDisjunction*, unsigned int*, JSC::Yarr::ByteTerm&)
60.00 ms   14.3%	40.00 ms	30	 	 JSC::Lexer<unsigned char>::lexWithoutClearingLineTerminator(JSC::JSToken*, WTF::OptionSet<JSC::LexerFlags>, bool)
38.00 ms    9.1%	38.00 ms	19	 	 __munmap
24.00 ms    5.7%	24.00 ms	12	 	 __openat_nocancel
20.00 ms    4.7%	20.00 ms	10	 	 __mmap
18.00 ms    4.3%	18.00 ms	9	 	 lstat
14.00 ms    3.3%	14.00 ms	7	 	 JSC::Yarr::Interpreter<unsigned char>::testCharacterClass(JSC::Yarr::CharacterClass*, int)
12.00 ms    2.8%	12.00 ms	6	 	 WTF::equal(WTF::StringImpl const*, unsigned char const*, unsigned int)
12.00 ms    2.8%	12.00 ms	6	 	 WTF::fastMalloc(unsigned long)
254.00 ms   60.9%	10.00 ms	127	 	 llint_entry
10.00 ms    2.3%	10.00 ms	5	 	 WTF::SmallSet<WTF::UniquedStringImpl*, WTF::PtrHashBase<WTF::UniquedStringImpl*, false>, 8u>::add(WTF::UniquedStringImpl*)
8.00 ms    1.9%	8.00 ms	4	 	 __getdirentries64
8.00 ms    1.9%	8.00 ms	4	 	 __psynch_cvsignal
8.00 ms    1.9%	8.00 ms	4	 	 _platform_memset
8.00 ms    1.9%	6.00 ms	4	 	 JSC::CodeBlock::updateAllNonLazyValueProfilePredictionsAndCountLiveness(JSC::ConcurrentJSLocker const&, unsigned int&, unsigned int&)
14.00 ms    3.3%	6.00 ms	7	 	 bmalloc_heap_config_specialized_local_allocator_try_allocate_small_segregated_slow
6.00 ms    1.4%	6.00 ms	3	 	 _platform_memmove
6.00 ms    1.4%	6.00 ms	3	 	 __close_nocancel
6.00 ms    1.4%	6.00 ms	3	 	 JSC::FunctionExecutable::finalizeUnconditionally(JSC::VM&, JSC::CollectionScope)
56.00 ms   13.4%	6.00 ms	28	 	 JSC::SyntaxChecker::Expression JSC::Parser<JSC::Lexer<unsigned char>>::parseMemberExpression<JSC::SyntaxChecker>(JSC::SyntaxChecker&)
26.00 ms    6.2%	4.00 ms	13	 	 JSC::ASTBuilder::Expression JSC::Parser<JSC::Lexer<unsigned char>>::parseMemberExpression<JSC::ASTBuilder>(JSC::ASTBuilder&)
12.00 ms    2.8%	4.00 ms	6	 	 JSC::Scope::collectFreeVariables(JSC::Scope*, bool)
4.00 ms    0.9%	4.00 ms	2	 	 WTF::HashTable<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, WTF::KeyValuePair<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel>, WTF::KeyValuePairKeyExtractor<WTF::KeyValuePair<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel>>, JSC::IdentifierRepHash, WTF::HashMap<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel, JSC::IdentifierRepHash, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>, WTF::HashTraits<JSC::BytecodeGenerator::TDZNecessityLevel>, WTF::HashTableTraits>::KeyValuePairTraits, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>>::HashTable(WTF::HashTable<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, WTF::KeyValuePair<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel>, WTF::KeyValuePairKeyExtractor<WTF::KeyValuePair<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel>>, JSC::IdentifierRepHash, WTF::HashMap<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::BytecodeGenerator::TDZNecessityLevel, JSC::IdentifierRepHash, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>, WTF::HashTraits<JSC::BytecodeGenerator::TDZNecessityLevel>, WTF::HashTableTraits>::KeyValuePairTraits, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>> const&)
4.00 ms    0.9%	4.00 ms	2	 	 JSC::BytecodeGenerator::needsTDZCheck(JSC::Variable const&)
4.00 ms    0.9%	4.00 ms	2	 	 JSC::ConservativeRoots::add(void*, void*, JSC::JITStubRoutineSet&, JSC::CodeBlockSet&)
4.00 ms    0.9%	4.00 ms	2	 	 pthread_getspecific
4.00 ms    0.9%	4.00 ms	2	 	 DYLD-STUB$$DYLD-STUB$$_memcpy
16.00 ms    3.8%	4.00 ms	8	 	 WTF::HashTableAddResult<WTF::HashTableIterator<WTF::HashTable<WTF::Packed<WTF::StringImpl*>, WTF::Packed<WTF::StringImpl*>, WTF::IdentityExtractor, WTF::DefaultHash<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>>, WTF::Packed<WTF::StringImpl*>, WTF::Packed<WTF::StringImpl*>, WTF::IdentityExtractor, WTF::DefaultHash<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>>> WTF::HashTable<WTF::Packed<WTF::StringImpl*>, WTF::Packed<WTF::StringImpl*>, WTF::IdentityExtractor, WTF::DefaultHash<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>, WTF::HashTraits<WTF::Packed<WTF::StringImpl*>>>::addPassingHashCode<WTF::HashSetTranslatorAdapter<WTF::LCharBufferTranslator>, WTF::HashTranslatorCharBuffer<unsigned char> const&, WTF::HashTranslatorCharBuffer<unsigned char> const&>(WTF::HashTranslatorCharBuffer<unsigned char> const&, WTF::HashTranslatorCharBuffer<unsigned char> const&)
4.00 ms    0.9%	4.00 ms	2	 	 pas_segregated_page_construct
4.00 ms    0.9%	4.00 ms	2	 	 pas_thread_local_cache_flush_deallocation_log
4.00 ms    0.9%	4.00 ms	2	 	 JSC::MarkedBlock::Handle::stopAllocating(JSC::FreeList const&)
4.00 ms    0.9%	4.00 ms	2	 	 slow_path_enter
60.00 ms   14.3%	4.00 ms	30	 	 JSC::SyntaxChecker::Expression JSC::Parser<JSC::Lexer<unsigned char>>::parseAssignmentExpression<JSC::SyntaxChecker>(JSC::SyntaxChecker&, JSC::Parser<JSC::Lexer<unsigned char>>::ExpressionErrorClassifier&)
28.00 ms    6.7%	4.00 ms	14	 	 JSC::ASTBuilder::Expression JSC::Parser<JSC::Lexer<unsigned char>>::parseAssignmentExpression<JSC::ASTBuilder>(JSC::ASTBuilder&, JSC::Parser<JSC::Lexer<unsigned char>>::ExpressionErrorClassifier&)
4.00 ms    0.9%	4.00 ms	2	 	 0x13810aa31
4.00 ms    0.9%	4.00 ms	2	 	 WTF::StringImpl::hashSlowCase() const
2.00 ms    0.4%	2.00 ms	1	 	 WTF::fastFree(void*)
2.00 ms    0.4%	2.00 ms	1	 	 operationSizeFrameForVarargs
2.00 ms    0.4%	2.00 ms	1	 	 JSC::PolymorphicAccess::visitWeak(JSC::VM&) const
2.00 ms    0.4%	2.00 ms	1	 	 __psynch_cvwait
2.00 ms    0.4%	2.00 ms	1	 	 Bun__Path__relative
2.00 ms    0.4%	2.00 ms	1	 	 Bun__getEnvNames
124.00 ms   29.7%	2.00 ms	62	 	 JSC::Parser<JSC::Lexer<unsigned char>>::parseInner(JSC::Identifier const&, JSC::ParsingContext, std::__1::optional<int>, WTF::FixedVector<JSC::JSTextPosition> const*, WTF::HashMap<WTF::RefPtr<WTF::UniquedStringImpl, WTF::PackedPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::PrivateNameEntry, JSC::IdentifierRepHash, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>, JSC::PrivateNameEntryHashTraits, WTF::HashTableTraits> const*)
2.00 ms    0.4%	2.00 ms	1	 	 0x13804c01d
2.00 ms    0.4%	2.00 ms	1	 	 JSC::Scope::declareFunction(JSC::Identifier const*, bool, bool)
2.00 ms    0.4%	2.00 ms	1	 	 operationPutByIdStrictOptimize
2.00 ms    0.4%	2.00 ms	1	 	 src.resolver.resolve_path.normalizeStringLooseBuf__anon_778932
36.00 ms    8.6%	2.00 ms	18	 	 JSC::ASTBuilder::SourceElements JSC::Parser<JSC::Lexer<unsigned char>>::parseSourceElements<JSC::ASTBuilder>(JSC::ASTBuilder&, JSC::SourceElementsMode)
2.00 ms    0.4%	2.00 ms	1	 	 JSC::WebAssemblyModuleRecord::initializeExports(JSC::JSGlobalObject*)::$_8::operator()(unsigned int) const
2.00 ms    0.4%	2.00 ms	1	 	 JSC::Lexer<unsigned char>::~Lexer()
2.00 ms    0.4%	2.00 ms	1	 	 JSC::repatchArrayPutByVal(JSC::JSGlobalObject*, JSC::CodeBlock*, JSC::JSValue, JSC::JSValue, JSC::StructureStubInfo&, JSC::PutByKind)
2.00 ms    0.4%	2.00 ms	1	 	 JSC::BuiltinExecutables::createExecutable(JSC::VM&, JSC::SourceCode const&, JSC::Identifier const&, JSC::ImplementationVisibility, JSC::ConstructorKind, JSC::ConstructAbility, JSC::NeedsClassFieldInitializer, JSC::PrivateBrandRequirement)
812.00 ms  194.7%	2.00 ms	406	 	 vmEntryToJavaScript
2.00 ms    0.4%	2.00 ms	1	 	 JSC::BytecodeGenerator::~BytecodeGenerator()
2.00 ms    0.4%	2.00 ms	1	 	 JSC::Scope::fillParametersForSourceProviderCache(JSC::SourceProviderCacheItemCreationParameters&, WTF::SmallSet<WTF::UniquedStringImpl*, WTF::PtrHashBase<WTF::UniquedStringImpl*, false>, 8u> const&)
2.00 ms    0.4%	2.00 ms	1	 	 dyld4::PrebuiltLoader::dependent(dyld4::RuntimeState const&, unsigned int, dyld4::Loader::DependentKind*) const
4.00 ms    0.9%	2.00 ms	2	 	 src.resolver.resolve_path.normalizeStringLooseBuf__anon_802881
2.00 ms    0.4%	2.00 ms	1	 	 JSC::UnlinkedFunctionExecutable::create(JSC::VM&, JSC::SourceCode const&, JSC::FunctionMetadataNode*, JSC::UnlinkedFunctionKind, JSC::ConstructAbility, JSC::JSParserScriptMode, WTF::RefPtr<JSC::TDZEnvironmentLink, WTF::RawPtrTraits<JSC::TDZEnvironmentLink>, WTF::DefaultRefDerefTraits<JSC::TDZEnvironmentLink>>, std::__1::optional<WTF::HashMap<WTF::RefPtr<WTF::UniquedStringImpl, WTF::PackedPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>, JSC::PrivateNameEntry, JSC::IdentifierRepHash, WTF::HashTraits<WTF::RefPtr<WTF::UniquedStringImpl, WTF::RawPtrTraits<WTF::UniquedStringImpl>, WTF::DefaultRefDerefTraits<WTF::UniquedStringImpl>>>, JSC::PrivateNameEntryHashTraits, WTF::HashTableTraits>>, JSC::DerivedContextType, JSC::NeedsClassFieldInitializer, JSC::PrivateBrandRequirement, bool)
2.00 ms    0.4%	2.00 ms	1	 	 JSC::prepareChainForCaching(JSC::JSGlobalObject*, JSC::JSCell*, JSC::Structure*, WTF::UniquedStringImpl*, JSC::JSObject*)
2.00 ms    0.4%	2.00 ms	1	 	 JSC::Parser<JSC::Lexer<unsigned char>>::declareVariable(JSC::Identifier const*, JSC::DeclarationType, JSC::DeclarationImportType)
```
Comment 5 Michael Saboff 2024-03-27 08:03:38 PDT
(In reply to Michael Saboff from comment #2)

> The reason we don’t support back references for ignore case 16bit JIT’ing is
> due to the complicated case folding rules for some Unicode characters. 
> Again there are two possible options for addressing this bug.
> 1) If the RegExp contains back references, allow the back reference if the
> referenced group’s contents are easily case folded.  8 bit characters would
> be easily handled by this fix.
> 2) Completely handle Unicode case folding.  This could be built upon the
> work of the first alternative.  A full implementation of this approach would
> require calling out to a case folding helper for some patterns.  This helper
> could be generated as needed.

JIT support for ignore case backreferences on ARM64 and X86-64 landed in https://commits.webkit.org/276681@main via https://bugs.webkit.org/show_bug.cgi?id=271617.

> The first non-zero based variable counted parenthesis issue could be addressed at least two ways:
> 1) With some RegExp rewriting, e.g. changing (?:\*[ \t]*){3,} to (?:\*[ \t]*){3}(?:\*[ \t]*)*.  We currently do this for one or more variable counted parens, ie (?:\*[ \t]*)+
> 2) Some more involved work to properly handle the fixed non-zero count of variable counted parens in the JIT directly.

Working towards the second option to support non-zero based variable counted parenthesis in the Yarr JIT.