haikuwebkit/Source/WTF/wtf/LockAlgorithmInlines.h

167 lines
7.2 KiB
C

It should be easy to decide how WebKit yields https://bugs.webkit.org/show_bug.cgi?id=174298 Reviewed by Saam Barati. Source/bmalloc: Use sched_yield() explicitly. * bmalloc/StaticMutex.cpp: (bmalloc::StaticMutex::lockSlowCase): Source/JavaScriptCore: Use the new WTF::Thread::yield() function for yielding instead of the C++ function. * heap/Heap.cpp: (JSC::Heap::resumeThePeriphery): * heap/VisitingTimeout.h: * runtime/JSCell.cpp: (JSC::JSCell::lockSlow): (JSC::JSCell::unlockSlow): * runtime/JSCell.h: * runtime/JSCellInlines.h: (JSC::JSCell::lock): (JSC::JSCell::unlock): * runtime/JSLock.cpp: (JSC::JSLock::grabAllLocks): * runtime/SamplingProfiler.cpp: Source/WebCore: No new tests because the WebCore change is just a change to how we #include things. * inspector/InspectorPageAgent.h: * inspector/TimelineRecordFactory.h: * workers/Worker.h: * workers/WorkerGlobalScopeProxy.h: * workers/WorkerMessagingProxy.h: Source/WebKitLegacy: * Storage/StorageTracker.h: Source/WTF: Created a Thread::yield() abstraction for sched_yield(), and made WTF use it everywhere that it had previously used std::this_thread::yield(). To make it less annoying to experiment with changes to the lock algorithm in the future, this also moves the meat of the algorithm into LockAlgorithmInlines.h. Only two files include that header. Since LockAlgorithm.h no longer includes ParkingLot.h, a bunch of files in WK now need to include timing headers (Seconds, MonotonicTime, etc) manually. * WTF.xcodeproj/project.pbxproj: * benchmarks/ToyLocks.h: * wtf/CMakeLists.txt: * wtf/Lock.cpp: * wtf/LockAlgorithm.h: (WTF::LockAlgorithm::lockSlow): Deleted. (WTF::LockAlgorithm::unlockSlow): Deleted. * wtf/LockAlgorithmInlines.h: Added. 
(WTF::hasParkedBit>::lockSlow): (WTF::hasParkedBit>::unlockSlow): * wtf/MainThread.cpp: * wtf/RunLoopTimer.h: * wtf/Threading.cpp: * wtf/Threading.h: * wtf/ThreadingPthreads.cpp: (WTF::Thread::yield): * wtf/ThreadingWin.cpp: (WTF::Thread::yield): * wtf/WordLock.cpp: (WTF::WordLockBase::lockSlow): (WTF::WordLockBase::unlockSlow): Canonical link: https://commits.webkit.org/191572@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@219763 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-07-22 14:36:18 +00:00
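The commit message above describes wrapping sched_yield() behind a single yield abstraction so that the yielding strategy can be changed in one place. A minimal sketch of that idea follows. It is not the WTF implementation: the namespace `sketch`, the function `yieldThread()`, and the class `SpinYieldLock` are hypothetical names chosen for illustration, and the use of SwitchToThread() on Windows is an assumption about what a comparable wrapper might call there.

```cpp
#include <atomic>
#if defined(_WIN32)
#include <windows.h>
#else
#include <sched.h>
#endif

namespace sketch {

// Hypothetical single entry point for yielding. Every caller goes through
// this function, so changing the strategy means editing one place.
inline void yieldThread()
{
#if defined(_WIN32)
    SwitchToThread(); // Assumed Windows counterpart of sched_yield().
#else
    sched_yield();
#endif
}

// Illustrative lock that yields between acquisition attempts.
// This is not WTF::Lock; it has no parking and no fairness guarantees.
class SpinYieldLock {
public:
    bool tryLock()
    {
        bool expected = false;
        return m_isHeld.compare_exchange_strong(expected, true, std::memory_order_acquire);
    }

    void lock()
    {
        while (!tryLock())
            yieldThread();
    }

    void unlock() { m_isHeld.store(false, std::memory_order_release); }

private:
    std::atomic<bool> m_isHeld { false };
};

} // namespace sketch
```

Routing every caller through one function is the design point the commit message makes: experiments with the yielding policy then touch a single definition rather than every call site.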
/*
Use constexpr instead of const in symbol definitions that are obviously constexpr. https://bugs.webkit.org/show_bug.cgi?id=201879 Rubber-stamped by Joseph Pecoraro. Source/bmalloc: * bmalloc/AvailableMemory.cpp: * bmalloc/IsoTLS.h: * bmalloc/Map.h: * bmalloc/Mutex.cpp: (bmalloc::Mutex::lockSlowCase): * bmalloc/PerThread.h: * bmalloc/Vector.h: * bmalloc/Zone.h: Source/JavaScriptCore: const may require external storage (at the compiler's whim) though these currently do not. constexpr makes it clear that the value is a literal constant that can be inlined. In most cases in the code, when we say static const, we actually mean static constexpr. I'm changing the code to reflect this. * API/JSAPIValueWrapper.h: * API/JSCallbackConstructor.h: * API/JSCallbackObject.h: * API/JSContextRef.cpp: * API/JSWrapperMap.mm: * API/tests/CompareAndSwapTest.cpp: * API/tests/TypedArrayCTest.cpp: * API/tests/testapi.mm: (testObjectiveCAPIMain): * KeywordLookupGenerator.py: (Trie.printAsC): * assembler/ARMv7Assembler.h: * assembler/AssemblerBuffer.h: * assembler/AssemblerCommon.h: * assembler/MacroAssembler.h: * assembler/MacroAssemblerARM64.h: * assembler/MacroAssemblerARM64E.h: * assembler/MacroAssemblerARMv7.h: * assembler/MacroAssemblerCodeRef.h: * assembler/MacroAssemblerMIPS.h: * assembler/MacroAssemblerX86.h: * assembler/MacroAssemblerX86Common.h: (JSC::MacroAssemblerX86Common::absDouble): (JSC::MacroAssemblerX86Common::negateDouble): * assembler/MacroAssemblerX86_64.h: * assembler/X86Assembler.h: * b3/B3Bank.h: * b3/B3CheckSpecial.h: * b3/B3DuplicateTails.cpp: * b3/B3EliminateCommonSubexpressions.cpp: * b3/B3FixSSA.cpp: * b3/B3FoldPathConstants.cpp: * b3/B3InferSwitches.cpp: * b3/B3Kind.h: * b3/B3LowerToAir.cpp: * b3/B3NativeTraits.h: * b3/B3ReduceDoubleToFloat.cpp: * b3/B3ReduceLoopStrength.cpp: * b3/B3ReduceStrength.cpp: * b3/B3ValueKey.h: * b3/air/AirAllocateRegistersByGraphColoring.cpp: * b3/air/AirAllocateStackByGraphColoring.cpp: * b3/air/AirArg.h: * 
b3/air/AirCCallSpecial.h: * b3/air/AirEmitShuffle.cpp: * b3/air/AirFixObviousSpills.cpp: * b3/air/AirFormTable.h: * b3/air/AirLowerAfterRegAlloc.cpp: * b3/air/AirPrintSpecial.h: * b3/air/AirStackAllocation.cpp: * b3/air/AirTmp.h: * b3/testb3_6.cpp: (testInterpreter): * bytecode/AccessCase.cpp: * bytecode/CallLinkStatus.cpp: * bytecode/CallVariant.h: * bytecode/CodeBlock.h: * bytecode/CodeOrigin.h: * bytecode/DFGExitProfile.h: * bytecode/DirectEvalCodeCache.h: * bytecode/ExecutableToCodeBlockEdge.h: * bytecode/GetterSetterAccessCase.cpp: * bytecode/LazyOperandValueProfile.h: * bytecode/ObjectPropertyCondition.h: * bytecode/ObjectPropertyConditionSet.cpp: * bytecode/PolymorphicAccess.cpp: * bytecode/PropertyCondition.h: * bytecode/SpeculatedType.h: * bytecode/StructureStubInfo.cpp: * bytecode/UnlinkedCodeBlock.cpp: (JSC::UnlinkedCodeBlock::typeProfilerExpressionInfoForBytecodeOffset): * bytecode/UnlinkedCodeBlock.h: * bytecode/UnlinkedEvalCodeBlock.h: * bytecode/UnlinkedFunctionCodeBlock.h: * bytecode/UnlinkedFunctionExecutable.h: * bytecode/UnlinkedModuleProgramCodeBlock.h: * bytecode/UnlinkedProgramCodeBlock.h: * bytecode/ValueProfile.h: * bytecode/VirtualRegister.h: * bytecode/Watchpoint.h: * bytecompiler/BytecodeGenerator.h: * bytecompiler/Label.h: * bytecompiler/NodesCodegen.cpp: (JSC::ThisNode::emitBytecode): * bytecompiler/RegisterID.h: * debugger/Breakpoint.h: * debugger/DebuggerParseData.cpp: * debugger/DebuggerPrimitives.h: * debugger/DebuggerScope.h: * dfg/DFGAbstractHeap.h: * dfg/DFGAbstractValue.h: * dfg/DFGArgumentsEliminationPhase.cpp: * dfg/DFGByteCodeParser.cpp: * dfg/DFGCSEPhase.cpp: * dfg/DFGCommon.h: * dfg/DFGCompilationKey.h: * dfg/DFGDesiredGlobalProperty.h: * dfg/DFGEdgeDominates.h: * dfg/DFGEpoch.h: * dfg/DFGForAllKills.h: (JSC::DFG::forAllKilledNodesAtNodeIndex): * dfg/DFGGraph.cpp: (JSC::DFG::Graph::isLiveInBytecode): * dfg/DFGHeapLocation.h: * dfg/DFGInPlaceAbstractState.cpp: * dfg/DFGIntegerCheckCombiningPhase.cpp: * 
dfg/DFGIntegerRangeOptimizationPhase.cpp: * dfg/DFGInvalidationPointInjectionPhase.cpp: * dfg/DFGLICMPhase.cpp: * dfg/DFGLazyNode.h: * dfg/DFGMinifiedID.h: * dfg/DFGMovHintRemovalPhase.cpp: * dfg/DFGNodeFlowProjection.h: * dfg/DFGNodeType.h: * dfg/DFGObjectAllocationSinkingPhase.cpp: * dfg/DFGPhantomInsertionPhase.cpp: * dfg/DFGPromotedHeapLocation.h: * dfg/DFGPropertyTypeKey.h: * dfg/DFGPureValue.h: * dfg/DFGPutStackSinkingPhase.cpp: * dfg/DFGRegisterBank.h: * dfg/DFGSSAConversionPhase.cpp: * dfg/DFGSSALoweringPhase.cpp: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::compileDoubleRep): (JSC::DFG::compileClampDoubleToByte): (JSC::DFG::SpeculativeJIT::compileArithRounding): (JSC::DFG::compileArithPowIntegerFastPath): (JSC::DFG::SpeculativeJIT::compileArithPow): (JSC::DFG::SpeculativeJIT::emitBinarySwitchStringRecurse): * dfg/DFGStackLayoutPhase.cpp: * dfg/DFGStoreBarrierInsertionPhase.cpp: * dfg/DFGStrengthReductionPhase.cpp: * dfg/DFGStructureAbstractValue.h: * dfg/DFGVarargsForwardingPhase.cpp: * dfg/DFGVariableEventStream.cpp: (JSC::DFG::VariableEventStream::reconstruct const): * dfg/DFGWatchpointCollectionPhase.cpp: * disassembler/ARM64/A64DOpcode.h: * ftl/FTLLocation.h: * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compileArithRandom): * ftl/FTLSlowPathCall.cpp: * ftl/FTLSlowPathCallKey.h: * heap/CellContainer.h: * heap/CellState.h: * heap/ConservativeRoots.h: * heap/GCSegmentedArray.h: * heap/HandleBlock.h: * heap/Heap.cpp: (JSC::Heap::updateAllocationLimits): * heap/Heap.h: * heap/HeapSnapshot.h: * heap/HeapUtil.h: (JSC::HeapUtil::findGCObjectPointersForMarking): * heap/IncrementalSweeper.cpp: * heap/LargeAllocation.h: * heap/MarkedBlock.cpp: * heap/Strong.h: * heap/VisitRaceKey.h: * heap/Weak.h: * heap/WeakBlock.h: * inspector/JSInjectedScriptHost.h: * inspector/JSInjectedScriptHostPrototype.h: * inspector/JSJavaScriptCallFrame.h: * inspector/JSJavaScriptCallFramePrototype.h: * inspector/agents/InspectorConsoleAgent.cpp: * 
inspector/agents/InspectorRuntimeAgent.cpp: (Inspector::InspectorRuntimeAgent::getRuntimeTypesForVariablesAtOffsets): * inspector/scripts/codegen/generate_cpp_protocol_types_header.py: (CppProtocolTypesHeaderGenerator._generate_versions): * inspector/scripts/tests/generic/expected/version.json-result: * interpreter/Interpreter.h: * interpreter/ShadowChicken.cpp: * jit/BinarySwitch.cpp: * jit/CallFrameShuffler.h: * jit/ExecutableAllocator.h: * jit/FPRInfo.h: * jit/GPRInfo.h: * jit/ICStats.h: * jit/JITThunks.h: * jit/Reg.h: * jit/RegisterSet.h: * jit/TempRegisterSet.h: * jsc.cpp: * parser/ASTBuilder.h: * parser/Nodes.h: * parser/SourceCodeKey.h: * parser/SyntaxChecker.h: * parser/VariableEnvironment.h: * profiler/ProfilerOrigin.h: * profiler/ProfilerOriginStack.h: * profiler/ProfilerUID.h: * runtime/AbstractModuleRecord.cpp: * runtime/ArrayBufferNeuteringWatchpointSet.h: * runtime/ArrayConstructor.h: * runtime/ArrayConventions.h: * runtime/ArrayIteratorPrototype.h: * runtime/ArrayPrototype.cpp: (JSC::setLength): * runtime/AsyncFromSyncIteratorPrototype.h: * runtime/AsyncGeneratorFunctionPrototype.h: * runtime/AsyncGeneratorPrototype.h: * runtime/AsyncIteratorPrototype.h: * runtime/AtomicsObject.cpp: * runtime/BigIntConstructor.h: * runtime/BigIntPrototype.h: * runtime/BooleanPrototype.h: * runtime/ClonedArguments.h: * runtime/CodeCache.h: * runtime/ControlFlowProfiler.h: * runtime/CustomGetterSetter.h: * runtime/DateConstructor.h: * runtime/DatePrototype.h: * runtime/DefinePropertyAttributes.h: * runtime/ErrorPrototype.h: * runtime/EvalExecutable.h: * runtime/Exception.h: * runtime/ExceptionHelpers.cpp: (JSC::invalidParameterInSourceAppender): (JSC::invalidParameterInstanceofSourceAppender): * runtime/ExceptionHelpers.h: * runtime/ExecutableBase.h: * runtime/FunctionExecutable.h: * runtime/FunctionRareData.h: * runtime/GeneratorPrototype.h: * runtime/GenericArguments.h: * runtime/GenericOffset.h: * runtime/GetPutInfo.h: * runtime/GetterSetter.h: * 
runtime/GlobalExecutable.h: * runtime/Identifier.h: * runtime/InspectorInstrumentationObject.h: * runtime/InternalFunction.h: * runtime/IntlCollatorConstructor.h: * runtime/IntlCollatorPrototype.h: * runtime/IntlDateTimeFormatConstructor.h: * runtime/IntlDateTimeFormatPrototype.h: * runtime/IntlNumberFormatConstructor.h: * runtime/IntlNumberFormatPrototype.h: * runtime/IntlObject.h: * runtime/IntlPluralRulesConstructor.h: * runtime/IntlPluralRulesPrototype.h: * runtime/IteratorPrototype.h: * runtime/JSArray.cpp: (JSC::JSArray::tryCreateUninitializedRestricted): * runtime/JSArray.h: * runtime/JSArrayBuffer.h: * runtime/JSArrayBufferView.h: * runtime/JSBigInt.h: * runtime/JSCJSValue.h: * runtime/JSCell.h: * runtime/JSCustomGetterSetterFunction.h: * runtime/JSDataView.h: * runtime/JSDataViewPrototype.h: * runtime/JSDestructibleObject.h: * runtime/JSFixedArray.h: * runtime/JSGenericTypedArrayView.h: * runtime/JSGlobalLexicalEnvironment.h: * runtime/JSGlobalObject.h: * runtime/JSImmutableButterfly.h: * runtime/JSInternalPromiseConstructor.h: * runtime/JSInternalPromiseDeferred.h: * runtime/JSInternalPromisePrototype.h: * runtime/JSLexicalEnvironment.h: * runtime/JSModuleEnvironment.h: * runtime/JSModuleLoader.h: * runtime/JSModuleNamespaceObject.h: * runtime/JSNonDestructibleProxy.h: * runtime/JSONObject.cpp: * runtime/JSONObject.h: * runtime/JSObject.h: * runtime/JSPromiseConstructor.h: * runtime/JSPromiseDeferred.h: * runtime/JSPromisePrototype.h: * runtime/JSPropertyNameEnumerator.h: * runtime/JSProxy.h: * runtime/JSScope.h: * runtime/JSScriptFetchParameters.h: * runtime/JSScriptFetcher.h: * runtime/JSSegmentedVariableObject.h: * runtime/JSSourceCode.h: * runtime/JSString.cpp: * runtime/JSString.h: * runtime/JSSymbolTableObject.h: * runtime/JSTemplateObjectDescriptor.h: * runtime/JSTypeInfo.h: * runtime/MapPrototype.h: * runtime/MinimumReservedZoneSize.h: * runtime/ModuleProgramExecutable.h: * runtime/NativeExecutable.h: * runtime/NativeFunction.h: * 
runtime/NativeStdFunctionCell.h: * runtime/NumberConstructor.h: * runtime/NumberPrototype.h: * runtime/ObjectConstructor.h: * runtime/ObjectPrototype.h: * runtime/ProgramExecutable.h: * runtime/PromiseDeferredTimer.cpp: * runtime/PropertyMapHashTable.h: * runtime/PropertyNameArray.h: (JSC::PropertyNameArray::add): * runtime/PrototypeKey.h: * runtime/ProxyConstructor.h: * runtime/ProxyObject.cpp: (JSC::ProxyObject::performGetOwnPropertyNames): * runtime/ProxyRevoke.h: * runtime/ReflectObject.h: * runtime/RegExp.h: * runtime/RegExpCache.h: * runtime/RegExpConstructor.h: * runtime/RegExpKey.h: * runtime/RegExpObject.h: * runtime/RegExpPrototype.h: * runtime/RegExpStringIteratorPrototype.h: * runtime/SamplingProfiler.cpp: * runtime/ScopedArgumentsTable.h: * runtime/ScriptExecutable.h: * runtime/SetPrototype.h: * runtime/SmallStrings.h: * runtime/SparseArrayValueMap.h: * runtime/StringConstructor.h: * runtime/StringIteratorPrototype.h: * runtime/StringObject.h: * runtime/StringPrototype.h: * runtime/Structure.h: * runtime/StructureChain.h: * runtime/StructureRareData.h: * runtime/StructureTransitionTable.h: * runtime/Symbol.h: * runtime/SymbolConstructor.h: * runtime/SymbolPrototype.h: * runtime/SymbolTable.h: * runtime/TemplateObjectDescriptor.h: * runtime/TypeProfiler.cpp: * runtime/TypeProfiler.h: * runtime/TypeProfilerLog.cpp: * runtime/VarOffset.h: * testRegExp.cpp: * tools/HeapVerifier.cpp: (JSC::HeapVerifier::checkIfRecorded): * tools/JSDollarVM.cpp: * wasm/WasmB3IRGenerator.cpp: * wasm/WasmBBQPlan.cpp: * wasm/WasmFaultSignalHandler.cpp: * wasm/WasmFunctionParser.h: * wasm/WasmOMGForOSREntryPlan.cpp: * wasm/WasmOMGPlan.cpp: * wasm/WasmPlan.cpp: * wasm/WasmSignature.cpp: * wasm/WasmSignature.h: * wasm/WasmWorklist.cpp: * wasm/js/JSWebAssembly.h: * wasm/js/JSWebAssemblyCodeBlock.h: * wasm/js/WebAssemblyCompileErrorConstructor.h: * wasm/js/WebAssemblyCompileErrorPrototype.h: * wasm/js/WebAssemblyFunction.h: * wasm/js/WebAssemblyInstanceConstructor.h: * 
wasm/js/WebAssemblyInstancePrototype.h: * wasm/js/WebAssemblyLinkErrorConstructor.h: * wasm/js/WebAssemblyLinkErrorPrototype.h: * wasm/js/WebAssemblyMemoryConstructor.h: * wasm/js/WebAssemblyMemoryPrototype.h: * wasm/js/WebAssemblyModuleConstructor.h: * wasm/js/WebAssemblyModulePrototype.h: * wasm/js/WebAssemblyRuntimeErrorConstructor.h: * wasm/js/WebAssemblyRuntimeErrorPrototype.h: * wasm/js/WebAssemblyTableConstructor.h: * wasm/js/WebAssemblyTablePrototype.h: * wasm/js/WebAssemblyToJSCallee.h: * yarr/Yarr.h: * yarr/YarrParser.h: * yarr/generateYarrCanonicalizeUnicode: Source/WebCore: No new tests. Covered by existing tests. * bindings/js/JSDOMConstructorBase.h: * bindings/js/JSDOMWindowProperties.h: * bindings/scripts/CodeGeneratorJS.pm: (GenerateHeader): (GeneratePrototypeDeclaration): * bindings/scripts/test/JS/JSTestActiveDOMObject.h: * bindings/scripts/test/JS/JSTestEnabledBySetting.h: * bindings/scripts/test/JS/JSTestEnabledForContext.h: * bindings/scripts/test/JS/JSTestEventTarget.h: * bindings/scripts/test/JS/JSTestGlobalObject.h: * bindings/scripts/test/JS/JSTestIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedGetterCallWith.h: * bindings/scripts/test/JS/JSTestNamedGetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedGetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterNoIdentifier.h: * 
bindings/scripts/test/JS/JSTestNamedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetterAndSetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgableProperties.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgablePropertiesAndOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestObj.h: * bindings/scripts/test/JS/JSTestOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestPluginInterface.h: * bindings/scripts/test/JS/JSTestTypedefs.h: * bridge/objc/objc_runtime.h: * bridge/runtime_array.h: * bridge/runtime_method.h: * bridge/runtime_object.h: Source/WebKit: * WebProcess/Plugins/Netscape/JSNPObject.h: Source/WTF: * wtf/Assertions.cpp: * wtf/AutomaticThread.cpp: * wtf/BitVector.h: * wtf/Bitmap.h: * wtf/BloomFilter.h: * wtf/Brigand.h: * wtf/CheckedArithmetic.h: * wtf/CrossThreadCopier.h: * wtf/CurrentTime.cpp: * wtf/DataLog.cpp: * wtf/DateMath.cpp: (WTF::daysFrom1970ToYear): * wtf/DeferrableRefCounted.h: * wtf/GetPtr.h: * wtf/HashFunctions.h: * wtf/HashMap.h: * wtf/HashTable.h: * wtf/HashTraits.h: * wtf/JSONValues.cpp: * wtf/JSONValues.h: * wtf/ListHashSet.h: * wtf/Lock.h: * wtf/LockAlgorithm.h: * wtf/LockAlgorithmInlines.h: (WTF::Hooks>::lockSlow): * wtf/Logger.h: * wtf/LoggerHelper.h: (WTF::LoggerHelper::childLogIdentifier const): * wtf/MainThread.cpp: * wtf/MetaAllocatorPtr.h: * wtf/MonotonicTime.h: * wtf/NaturalLoops.h: (WTF::NaturalLoops::NaturalLoops): * wtf/ObjectIdentifier.h: * wtf/RAMSize.cpp: * wtf/Ref.h: * wtf/RefPtr.h: * wtf/RetainPtr.h: * wtf/SchedulePair.h: * wtf/StackShot.h: * wtf/StdLibExtras.h: * wtf/TinyPtrSet.h: * wtf/URL.cpp: * wtf/URLHash.h: * wtf/URLParser.cpp: (WTF::URLParser::defaultPortForProtocol): * wtf/Vector.h: * wtf/VectorTraits.h: * wtf/WallTime.h: * wtf/WeakHashSet.h: * wtf/WordLock.h: * 
wtf/cocoa/CPUTimeCocoa.cpp: * wtf/cocoa/MemoryPressureHandlerCocoa.mm: * wtf/persistence/PersistentDecoder.h: * wtf/persistence/PersistentEncoder.h: * wtf/text/AtomStringHash.h: * wtf/text/CString.h: * wtf/text/StringBuilder.cpp: (WTF::expandedCapacity): * wtf/text/StringHash.h: * wtf/text/StringImpl.h: * wtf/text/StringToIntegerConversion.h: (WTF::toIntegralType): * wtf/text/SymbolRegistry.h: * wtf/text/TextStream.cpp: (WTF::hasFractions): * wtf/text/WTFString.h: * wtf/text/cocoa/TextBreakIteratorInternalICUCocoa.cpp: Canonical link: https://commits.webkit.org/215538@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@250005 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2019-09-18 00:36:19 +00:00
* Copyright (C) 2015-2019 Apple Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
* PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
* CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
* EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
* PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
* OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#pragma once
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-05 17:53:57 +00:00
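The commit message above mentions holdLockIf as a way to perform predicated lock acquisition. As a rough illustration of that idea only, here is a minimal sketch built on the C++ standard library rather than on WTF. The names `holdLockIfSketch` and `readCounter` are hypothetical and are not the WebKit API; the real `WTF::holdLockIf` in wtf/Locker.h may differ in signature and behavior.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the idea behind holdLockIf, using std::mutex.
// The returned guard owns the mutex only when `predicate` is true.
template<typename MutexType>
std::unique_lock<MutexType> holdLockIfSketch(MutexType& mutex, bool predicate)
{
    if (predicate)
        return std::unique_lock<MutexType>(mutex); // acquires immediately
    return std::unique_lock<MutexType>(mutex, std::defer_lock); // associated, never locked
}

// Example caller (also hypothetical): lock only when the data is shared.
inline int readCounter(std::mutex& mutex, const int& counter, bool isShared)
{
    auto locker = holdLockIfSketch(mutex, isShared);
    assert(locker.owns_lock() == isShared);
    return counter; // the guard releases the mutex on scope exit if it owns it
}
```

The caller keeps a single code path either way: the guard's destructor unlocks only if the lock was actually taken, so no branch is needed at the release site.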
#include <wtf/DataLog.h>
#include <wtf/LockAlgorithm.h>
#include <wtf/ParkingLot.h>
Merge WTFThreadData to Thread::current https://bugs.webkit.org/show_bug.cgi?id=174716 Reviewed by Mark Lam. Source/JavaScriptCore: Use Thread::current() instead. * API/JSContext.mm: (+[JSContext currentContext]): (+[JSContext currentThis]): (+[JSContext currentCallee]): (+[JSContext currentArguments]): (-[JSContext beginCallbackWithData:calleeValue:thisValue:argumentCount:arguments:]): (-[JSContext endCallbackWithData:]): * heap/Heap.cpp: (JSC::Heap::requestCollection): * runtime/Completion.cpp: (JSC::checkSyntax): (JSC::checkModuleSyntax): (JSC::evaluate): (JSC::loadAndEvaluateModule): (JSC::loadModule): (JSC::linkAndEvaluateModule): (JSC::importModule): * runtime/Identifier.cpp: (JSC::Identifier::checkCurrentAtomicStringTable): * runtime/InitializeThreading.cpp: (JSC::initializeThreading): * runtime/JSLock.cpp: (JSC::JSLock::didAcquireLock): (JSC::JSLock::willReleaseLock): (JSC::JSLock::dropAllLocks): (JSC::JSLock::grabAllLocks): * runtime/JSLock.h: * runtime/VM.cpp: (JSC::VM::VM): (JSC::VM::updateStackLimits): (JSC::VM::committedStackByteCount): * runtime/VM.h: (JSC::VM::isSafeToRecurse const): * runtime/VMEntryScope.cpp: (JSC::VMEntryScope::VMEntryScope): * runtime/VMInlines.h: (JSC::VM::ensureStackCapacityFor): * yarr/YarrPattern.cpp: (JSC::Yarr::YarrPatternConstructor::isSafeToRecurse const): Source/WebCore: Use Thread::current() instead. * fileapi/AsyncFileStream.cpp: * platform/ThreadGlobalData.cpp: (WebCore::ThreadGlobalData::ThreadGlobalData): * platform/graphics/cocoa/WebCoreDecompressionSession.h: * platform/ios/wak/WebCoreThread.mm: (StartWebThread): * workers/WorkerThread.cpp: (WebCore::WorkerThread::workerThread): Source/WTF: We placed thread specific data in WTFThreadData previously. But now, we have a new good place to put thread specific data: WTF::Thread. Before this patch, WTFThreadData and WTF::Thread sometimes have the completely same fields (m_stack etc.) due to initialization order limitations. This patch merges WTFThreadData to WTF::Thread. 
We apply WTFThreadData's initialization style to WTF::Thread. So, WTF::Thread's holder now uses fast TLS for darwin environment. Thus, Thread::current() access is now accelerated. And WTF::Thread::current() can be accessed even before calling WTF::initializeThreading. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/LockAlgorithm.h: * wtf/LockAlgorithmInlines.h: * wtf/MainThread.h: * wtf/ParkingLot.cpp: * wtf/StackStats.cpp: (WTF::StackStats::PerThreadStats::PerThreadStats): (WTF::StackStats::CheckPoint::CheckPoint): (WTF::StackStats::CheckPoint::~CheckPoint): (WTF::StackStats::probe): (WTF::StackStats::LayoutCheckPoint::LayoutCheckPoint): * wtf/ThreadHolder.cpp: (WTF::ThreadHolder::initializeCurrent): * wtf/ThreadHolder.h: (WTF::ThreadHolder::ThreadHolder): (WTF::ThreadHolder::currentMayBeNull): (WTF::ThreadHolder::current): * wtf/ThreadHolderPthreads.cpp: (WTF::ThreadHolder::initializeKey): (WTF::ThreadHolder::initialize): (WTF::ThreadHolder::destruct): (WTF::ThreadHolder::initializeOnce): Deleted. (WTF::ThreadHolder::current): Deleted. * wtf/ThreadHolderWin.cpp: (WTF::ThreadHolder::initializeKey): (WTF::ThreadHolder::currentDying): (WTF::ThreadHolder::initialize): (WTF::ThreadHolder::initializeOnce): Deleted. (WTF::ThreadHolder::current): Deleted. * wtf/Threading.cpp: (WTF::Thread::initializeInThread): (WTF::Thread::entryPoint): (WTF::Thread::create): (WTF::Thread::didExit): (WTF::initializeThreading): (WTF::Thread::currentMayBeNull): Deleted. * wtf/Threading.h: (WTF::Thread::current): (WTF::Thread::atomicStringTable): (WTF::Thread::setCurrentAtomicStringTable): (WTF::Thread::stackStats): (WTF::Thread::savedStackPointerAtVMEntry): (WTF::Thread::setSavedStackPointerAtVMEntry): (WTF::Thread::savedLastStackTop): (WTF::Thread::setSavedLastStackTop): * wtf/ThreadingPrimitives.h: * wtf/ThreadingPthreads.cpp: (WTF::Thread::createCurrentThread): (WTF::Thread::current): Deleted. 
* wtf/ThreadingWin.cpp: (WTF::Thread::createCurrentThread): (WTF::Thread::current): Deleted. * wtf/WTFThreadData.cpp: Removed. * wtf/WTFThreadData.h: Removed. * wtf/text/AtomicString.cpp: * wtf/text/AtomicStringImpl.cpp: (WTF::stringTable): * wtf/text/AtomicStringTable.cpp: (WTF::AtomicStringTable::create): * wtf/text/AtomicStringTable.h: Canonical link: https://commits.webkit.org/191874@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@220186 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-08-03 06:03:18 +00:00
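The commit message above says that Thread::current() can be called even before WTF::initializeThreading, because the per-thread holder is set up on demand. One way to picture that pattern is a lazily initialized thread-local record. The sketch below uses standard C++ `thread_local` and is an assumption-laden illustration: `ThreadRecordSketch` and `currentThreadSketch` are hypothetical names, and the real WTF code uses its own ThreadHolder and platform TLS rather than this.

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical per-thread record; the real WTF::Thread carries much more state.
struct ThreadRecordSketch {
    std::thread::id id;
    void* savedLastStackTop { nullptr };
};

// Creates the calling thread's record on first use, so no global
// initialization step has to run before the first call.
inline ThreadRecordSketch& currentThreadSketch()
{
    thread_local std::unique_ptr<ThreadRecordSketch> record;
    if (!record) {
        record = std::make_unique<ThreadRecordSketch>();
        record->id = std::this_thread::get_id();
    }
    return *record;
}
```

Repeated calls on the same thread return the same record, while each thread gets its own, which is the property that lets per-thread fields such as a saved stack pointer live on the thread object.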
#include <wtf/Threading.h>
// It's a good idea to avoid including this header in too many places, so that it's possible to change
// the lock algorithm slow path without recompiling the world. Right now this should be included in two
// places (Lock.cpp and JSCell.cpp).
namespace WTF {
template<typename LockType, LockType isHeldBit, LockType hasParkedBit, typename Hooks>
void LockAlgorithm<LockType, isHeldBit, hasParkedBit, Hooks>::lockSlow(Atomic<LockType>& lock)
{
// This magic number turns out to be optimal based on past JikesRVM experiments.
Use constexpr instead of const in symbol definitions that are obviously constexpr. https://bugs.webkit.org/show_bug.cgi?id=201879 Rubber-stamped by Joseph Pecoraro. Source/bmalloc: * bmalloc/AvailableMemory.cpp: * bmalloc/IsoTLS.h: * bmalloc/Map.h: * bmalloc/Mutex.cpp: (bmalloc::Mutex::lockSlowCase): * bmalloc/PerThread.h: * bmalloc/Vector.h: * bmalloc/Zone.h: Source/JavaScriptCore: const may require external storage (at the compiler's whim) though these currently do not. constexpr makes it clear that the value is a literal constant that can be inlined. In most cases in the code, when we say static const, we actually mean static constexpr. I'm changing the code to reflect this. * API/JSAPIValueWrapper.h: * API/JSCallbackConstructor.h: * API/JSCallbackObject.h: * API/JSContextRef.cpp: * API/JSWrapperMap.mm: * API/tests/CompareAndSwapTest.cpp: * API/tests/TypedArrayCTest.cpp: * API/tests/testapi.mm: (testObjectiveCAPIMain): * KeywordLookupGenerator.py: (Trie.printAsC): * assembler/ARMv7Assembler.h: * assembler/AssemblerBuffer.h: * assembler/AssemblerCommon.h: * assembler/MacroAssembler.h: * assembler/MacroAssemblerARM64.h: * assembler/MacroAssemblerARM64E.h: * assembler/MacroAssemblerARMv7.h: * assembler/MacroAssemblerCodeRef.h: * assembler/MacroAssemblerMIPS.h: * assembler/MacroAssemblerX86.h: * assembler/MacroAssemblerX86Common.h: (JSC::MacroAssemblerX86Common::absDouble): (JSC::MacroAssemblerX86Common::negateDouble): * assembler/MacroAssemblerX86_64.h: * assembler/X86Assembler.h: * b3/B3Bank.h: * b3/B3CheckSpecial.h: * b3/B3DuplicateTails.cpp: * b3/B3EliminateCommonSubexpressions.cpp: * b3/B3FixSSA.cpp: * b3/B3FoldPathConstants.cpp: * b3/B3InferSwitches.cpp: * b3/B3Kind.h: * b3/B3LowerToAir.cpp: * b3/B3NativeTraits.h: * b3/B3ReduceDoubleToFloat.cpp: * b3/B3ReduceLoopStrength.cpp: * b3/B3ReduceStrength.cpp: * b3/B3ValueKey.h: * b3/air/AirAllocateRegistersByGraphColoring.cpp: * b3/air/AirAllocateStackByGraphColoring.cpp: * b3/air/AirArg.h: * 
b3/air/AirCCallSpecial.h: * b3/air/AirEmitShuffle.cpp: * b3/air/AirFixObviousSpills.cpp: * b3/air/AirFormTable.h: * b3/air/AirLowerAfterRegAlloc.cpp: * b3/air/AirPrintSpecial.h: * b3/air/AirStackAllocation.cpp: * b3/air/AirTmp.h: * b3/testb3_6.cpp: (testInterpreter): * bytecode/AccessCase.cpp: * bytecode/CallLinkStatus.cpp: * bytecode/CallVariant.h: * bytecode/CodeBlock.h: * bytecode/CodeOrigin.h: * bytecode/DFGExitProfile.h: * bytecode/DirectEvalCodeCache.h: * bytecode/ExecutableToCodeBlockEdge.h: * bytecode/GetterSetterAccessCase.cpp: * bytecode/LazyOperandValueProfile.h: * bytecode/ObjectPropertyCondition.h: * bytecode/ObjectPropertyConditionSet.cpp: * bytecode/PolymorphicAccess.cpp: * bytecode/PropertyCondition.h: * bytecode/SpeculatedType.h: * bytecode/StructureStubInfo.cpp: * bytecode/UnlinkedCodeBlock.cpp: (JSC::UnlinkedCodeBlock::typeProfilerExpressionInfoForBytecodeOffset): * bytecode/UnlinkedCodeBlock.h: * bytecode/UnlinkedEvalCodeBlock.h: * bytecode/UnlinkedFunctionCodeBlock.h: * bytecode/UnlinkedFunctionExecutable.h: * bytecode/UnlinkedModuleProgramCodeBlock.h: * bytecode/UnlinkedProgramCodeBlock.h: * bytecode/ValueProfile.h: * bytecode/VirtualRegister.h: * bytecode/Watchpoint.h: * bytecompiler/BytecodeGenerator.h: * bytecompiler/Label.h: * bytecompiler/NodesCodegen.cpp: (JSC::ThisNode::emitBytecode): * bytecompiler/RegisterID.h: * debugger/Breakpoint.h: * debugger/DebuggerParseData.cpp: * debugger/DebuggerPrimitives.h: * debugger/DebuggerScope.h: * dfg/DFGAbstractHeap.h: * dfg/DFGAbstractValue.h: * dfg/DFGArgumentsEliminationPhase.cpp: * dfg/DFGByteCodeParser.cpp: * dfg/DFGCSEPhase.cpp: * dfg/DFGCommon.h: * dfg/DFGCompilationKey.h: * dfg/DFGDesiredGlobalProperty.h: * dfg/DFGEdgeDominates.h: * dfg/DFGEpoch.h: * dfg/DFGForAllKills.h: (JSC::DFG::forAllKilledNodesAtNodeIndex): * dfg/DFGGraph.cpp: (JSC::DFG::Graph::isLiveInBytecode): * dfg/DFGHeapLocation.h: * dfg/DFGInPlaceAbstractState.cpp: * dfg/DFGIntegerCheckCombiningPhase.cpp: * 
dfg/DFGIntegerRangeOptimizationPhase.cpp: * dfg/DFGInvalidationPointInjectionPhase.cpp: * dfg/DFGLICMPhase.cpp: * dfg/DFGLazyNode.h: * dfg/DFGMinifiedID.h: * dfg/DFGMovHintRemovalPhase.cpp: * dfg/DFGNodeFlowProjection.h: * dfg/DFGNodeType.h: * dfg/DFGObjectAllocationSinkingPhase.cpp: * dfg/DFGPhantomInsertionPhase.cpp: * dfg/DFGPromotedHeapLocation.h: * dfg/DFGPropertyTypeKey.h: * dfg/DFGPureValue.h: * dfg/DFGPutStackSinkingPhase.cpp: * dfg/DFGRegisterBank.h: * dfg/DFGSSAConversionPhase.cpp: * dfg/DFGSSALoweringPhase.cpp: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::compileDoubleRep): (JSC::DFG::compileClampDoubleToByte): (JSC::DFG::SpeculativeJIT::compileArithRounding): (JSC::DFG::compileArithPowIntegerFastPath): (JSC::DFG::SpeculativeJIT::compileArithPow): (JSC::DFG::SpeculativeJIT::emitBinarySwitchStringRecurse): * dfg/DFGStackLayoutPhase.cpp: * dfg/DFGStoreBarrierInsertionPhase.cpp: * dfg/DFGStrengthReductionPhase.cpp: * dfg/DFGStructureAbstractValue.h: * dfg/DFGVarargsForwardingPhase.cpp: * dfg/DFGVariableEventStream.cpp: (JSC::DFG::VariableEventStream::reconstruct const): * dfg/DFGWatchpointCollectionPhase.cpp: * disassembler/ARM64/A64DOpcode.h: * ftl/FTLLocation.h: * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compileArithRandom): * ftl/FTLSlowPathCall.cpp: * ftl/FTLSlowPathCallKey.h: * heap/CellContainer.h: * heap/CellState.h: * heap/ConservativeRoots.h: * heap/GCSegmentedArray.h: * heap/HandleBlock.h: * heap/Heap.cpp: (JSC::Heap::updateAllocationLimits): * heap/Heap.h: * heap/HeapSnapshot.h: * heap/HeapUtil.h: (JSC::HeapUtil::findGCObjectPointersForMarking): * heap/IncrementalSweeper.cpp: * heap/LargeAllocation.h: * heap/MarkedBlock.cpp: * heap/Strong.h: * heap/VisitRaceKey.h: * heap/Weak.h: * heap/WeakBlock.h: * inspector/JSInjectedScriptHost.h: * inspector/JSInjectedScriptHostPrototype.h: * inspector/JSJavaScriptCallFrame.h: * inspector/JSJavaScriptCallFramePrototype.h: * inspector/agents/InspectorConsoleAgent.cpp: * 
inspector/agents/InspectorRuntimeAgent.cpp: (Inspector::InspectorRuntimeAgent::getRuntimeTypesForVariablesAtOffsets): * inspector/scripts/codegen/generate_cpp_protocol_types_header.py: (CppProtocolTypesHeaderGenerator._generate_versions): * inspector/scripts/tests/generic/expected/version.json-result: * interpreter/Interpreter.h: * interpreter/ShadowChicken.cpp: * jit/BinarySwitch.cpp: * jit/CallFrameShuffler.h: * jit/ExecutableAllocator.h: * jit/FPRInfo.h: * jit/GPRInfo.h: * jit/ICStats.h: * jit/JITThunks.h: * jit/Reg.h: * jit/RegisterSet.h: * jit/TempRegisterSet.h: * jsc.cpp: * parser/ASTBuilder.h: * parser/Nodes.h: * parser/SourceCodeKey.h: * parser/SyntaxChecker.h: * parser/VariableEnvironment.h: * profiler/ProfilerOrigin.h: * profiler/ProfilerOriginStack.h: * profiler/ProfilerUID.h: * runtime/AbstractModuleRecord.cpp: * runtime/ArrayBufferNeuteringWatchpointSet.h: * runtime/ArrayConstructor.h: * runtime/ArrayConventions.h: * runtime/ArrayIteratorPrototype.h: * runtime/ArrayPrototype.cpp: (JSC::setLength): * runtime/AsyncFromSyncIteratorPrototype.h: * runtime/AsyncGeneratorFunctionPrototype.h: * runtime/AsyncGeneratorPrototype.h: * runtime/AsyncIteratorPrototype.h: * runtime/AtomicsObject.cpp: * runtime/BigIntConstructor.h: * runtime/BigIntPrototype.h: * runtime/BooleanPrototype.h: * runtime/ClonedArguments.h: * runtime/CodeCache.h: * runtime/ControlFlowProfiler.h: * runtime/CustomGetterSetter.h: * runtime/DateConstructor.h: * runtime/DatePrototype.h: * runtime/DefinePropertyAttributes.h: * runtime/ErrorPrototype.h: * runtime/EvalExecutable.h: * runtime/Exception.h: * runtime/ExceptionHelpers.cpp: (JSC::invalidParameterInSourceAppender): (JSC::invalidParameterInstanceofSourceAppender): * runtime/ExceptionHelpers.h: * runtime/ExecutableBase.h: * runtime/FunctionExecutable.h: * runtime/FunctionRareData.h: * runtime/GeneratorPrototype.h: * runtime/GenericArguments.h: * runtime/GenericOffset.h: * runtime/GetPutInfo.h: * runtime/GetterSetter.h: * 
runtime/GlobalExecutable.h: * runtime/Identifier.h: * runtime/InspectorInstrumentationObject.h: * runtime/InternalFunction.h: * runtime/IntlCollatorConstructor.h: * runtime/IntlCollatorPrototype.h: * runtime/IntlDateTimeFormatConstructor.h: * runtime/IntlDateTimeFormatPrototype.h: * runtime/IntlNumberFormatConstructor.h: * runtime/IntlNumberFormatPrototype.h: * runtime/IntlObject.h: * runtime/IntlPluralRulesConstructor.h: * runtime/IntlPluralRulesPrototype.h: * runtime/IteratorPrototype.h: * runtime/JSArray.cpp: (JSC::JSArray::tryCreateUninitializedRestricted): * runtime/JSArray.h: * runtime/JSArrayBuffer.h: * runtime/JSArrayBufferView.h: * runtime/JSBigInt.h: * runtime/JSCJSValue.h: * runtime/JSCell.h: * runtime/JSCustomGetterSetterFunction.h: * runtime/JSDataView.h: * runtime/JSDataViewPrototype.h: * runtime/JSDestructibleObject.h: * runtime/JSFixedArray.h: * runtime/JSGenericTypedArrayView.h: * runtime/JSGlobalLexicalEnvironment.h: * runtime/JSGlobalObject.h: * runtime/JSImmutableButterfly.h: * runtime/JSInternalPromiseConstructor.h: * runtime/JSInternalPromiseDeferred.h: * runtime/JSInternalPromisePrototype.h: * runtime/JSLexicalEnvironment.h: * runtime/JSModuleEnvironment.h: * runtime/JSModuleLoader.h: * runtime/JSModuleNamespaceObject.h: * runtime/JSNonDestructibleProxy.h: * runtime/JSONObject.cpp: * runtime/JSONObject.h: * runtime/JSObject.h: * runtime/JSPromiseConstructor.h: * runtime/JSPromiseDeferred.h: * runtime/JSPromisePrototype.h: * runtime/JSPropertyNameEnumerator.h: * runtime/JSProxy.h: * runtime/JSScope.h: * runtime/JSScriptFetchParameters.h: * runtime/JSScriptFetcher.h: * runtime/JSSegmentedVariableObject.h: * runtime/JSSourceCode.h: * runtime/JSString.cpp: * runtime/JSString.h: * runtime/JSSymbolTableObject.h: * runtime/JSTemplateObjectDescriptor.h: * runtime/JSTypeInfo.h: * runtime/MapPrototype.h: * runtime/MinimumReservedZoneSize.h: * runtime/ModuleProgramExecutable.h: * runtime/NativeExecutable.h: * runtime/NativeFunction.h: * 
runtime/NativeStdFunctionCell.h: * runtime/NumberConstructor.h: * runtime/NumberPrototype.h: * runtime/ObjectConstructor.h: * runtime/ObjectPrototype.h: * runtime/ProgramExecutable.h: * runtime/PromiseDeferredTimer.cpp: * runtime/PropertyMapHashTable.h: * runtime/PropertyNameArray.h: (JSC::PropertyNameArray::add): * runtime/PrototypeKey.h: * runtime/ProxyConstructor.h: * runtime/ProxyObject.cpp: (JSC::ProxyObject::performGetOwnPropertyNames): * runtime/ProxyRevoke.h: * runtime/ReflectObject.h: * runtime/RegExp.h: * runtime/RegExpCache.h: * runtime/RegExpConstructor.h: * runtime/RegExpKey.h: * runtime/RegExpObject.h: * runtime/RegExpPrototype.h: * runtime/RegExpStringIteratorPrototype.h: * runtime/SamplingProfiler.cpp: * runtime/ScopedArgumentsTable.h: * runtime/ScriptExecutable.h: * runtime/SetPrototype.h: * runtime/SmallStrings.h: * runtime/SparseArrayValueMap.h: * runtime/StringConstructor.h: * runtime/StringIteratorPrototype.h: * runtime/StringObject.h: * runtime/StringPrototype.h: * runtime/Structure.h: * runtime/StructureChain.h: * runtime/StructureRareData.h: * runtime/StructureTransitionTable.h: * runtime/Symbol.h: * runtime/SymbolConstructor.h: * runtime/SymbolPrototype.h: * runtime/SymbolTable.h: * runtime/TemplateObjectDescriptor.h: * runtime/TypeProfiler.cpp: * runtime/TypeProfiler.h: * runtime/TypeProfilerLog.cpp: * runtime/VarOffset.h: * testRegExp.cpp: * tools/HeapVerifier.cpp: (JSC::HeapVerifier::checkIfRecorded): * tools/JSDollarVM.cpp: * wasm/WasmB3IRGenerator.cpp: * wasm/WasmBBQPlan.cpp: * wasm/WasmFaultSignalHandler.cpp: * wasm/WasmFunctionParser.h: * wasm/WasmOMGForOSREntryPlan.cpp: * wasm/WasmOMGPlan.cpp: * wasm/WasmPlan.cpp: * wasm/WasmSignature.cpp: * wasm/WasmSignature.h: * wasm/WasmWorklist.cpp: * wasm/js/JSWebAssembly.h: * wasm/js/JSWebAssemblyCodeBlock.h: * wasm/js/WebAssemblyCompileErrorConstructor.h: * wasm/js/WebAssemblyCompileErrorPrototype.h: * wasm/js/WebAssemblyFunction.h: * wasm/js/WebAssemblyInstanceConstructor.h: * 
wasm/js/WebAssemblyInstancePrototype.h: * wasm/js/WebAssemblyLinkErrorConstructor.h: * wasm/js/WebAssemblyLinkErrorPrototype.h: * wasm/js/WebAssemblyMemoryConstructor.h: * wasm/js/WebAssemblyMemoryPrototype.h: * wasm/js/WebAssemblyModuleConstructor.h: * wasm/js/WebAssemblyModulePrototype.h: * wasm/js/WebAssemblyRuntimeErrorConstructor.h: * wasm/js/WebAssemblyRuntimeErrorPrototype.h: * wasm/js/WebAssemblyTableConstructor.h: * wasm/js/WebAssemblyTablePrototype.h: * wasm/js/WebAssemblyToJSCallee.h: * yarr/Yarr.h: * yarr/YarrParser.h: * yarr/generateYarrCanonicalizeUnicode: Source/WebCore: No new tests. Covered by existing tests. * bindings/js/JSDOMConstructorBase.h: * bindings/js/JSDOMWindowProperties.h: * bindings/scripts/CodeGeneratorJS.pm: (GenerateHeader): (GeneratePrototypeDeclaration): * bindings/scripts/test/JS/JSTestActiveDOMObject.h: * bindings/scripts/test/JS/JSTestEnabledBySetting.h: * bindings/scripts/test/JS/JSTestEnabledForContext.h: * bindings/scripts/test/JS/JSTestEventTarget.h: * bindings/scripts/test/JS/JSTestGlobalObject.h: * bindings/scripts/test/JS/JSTestIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedGetterCallWith.h: * bindings/scripts/test/JS/JSTestNamedGetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedGetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterNoIdentifier.h: * 
bindings/scripts/test/JS/JSTestNamedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetterAndSetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgableProperties.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgablePropertiesAndOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestObj.h: * bindings/scripts/test/JS/JSTestOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestPluginInterface.h: * bindings/scripts/test/JS/JSTestTypedefs.h: * bridge/objc/objc_runtime.h: * bridge/runtime_array.h: * bridge/runtime_method.h: * bridge/runtime_object.h: Source/WebKit: * WebProcess/Plugins/Netscape/JSNPObject.h: Source/WTF: * wtf/Assertions.cpp: * wtf/AutomaticThread.cpp: * wtf/BitVector.h: * wtf/Bitmap.h: * wtf/BloomFilter.h: * wtf/Brigand.h: * wtf/CheckedArithmetic.h: * wtf/CrossThreadCopier.h: * wtf/CurrentTime.cpp: * wtf/DataLog.cpp: * wtf/DateMath.cpp: (WTF::daysFrom1970ToYear): * wtf/DeferrableRefCounted.h: * wtf/GetPtr.h: * wtf/HashFunctions.h: * wtf/HashMap.h: * wtf/HashTable.h: * wtf/HashTraits.h: * wtf/JSONValues.cpp: * wtf/JSONValues.h: * wtf/ListHashSet.h: * wtf/Lock.h: * wtf/LockAlgorithm.h: * wtf/LockAlgorithmInlines.h: (WTF::Hooks>::lockSlow): * wtf/Logger.h: * wtf/LoggerHelper.h: (WTF::LoggerHelper::childLogIdentifier const): * wtf/MainThread.cpp: * wtf/MetaAllocatorPtr.h: * wtf/MonotonicTime.h: * wtf/NaturalLoops.h: (WTF::NaturalLoops::NaturalLoops): * wtf/ObjectIdentifier.h: * wtf/RAMSize.cpp: * wtf/Ref.h: * wtf/RefPtr.h: * wtf/RetainPtr.h: * wtf/SchedulePair.h: * wtf/StackShot.h: * wtf/StdLibExtras.h: * wtf/TinyPtrSet.h: * wtf/URL.cpp: * wtf/URLHash.h: * wtf/URLParser.cpp: (WTF::URLParser::defaultPortForProtocol): * wtf/Vector.h: * wtf/VectorTraits.h: * wtf/WallTime.h: * wtf/WeakHashSet.h: * wtf/WordLock.h: * 
wtf/cocoa/CPUTimeCocoa.cpp: * wtf/cocoa/MemoryPressureHandlerCocoa.mm: * wtf/persistence/PersistentDecoder.h: * wtf/persistence/PersistentEncoder.h: * wtf/text/AtomStringHash.h: * wtf/text/CString.h: * wtf/text/StringBuilder.cpp: (WTF::expandedCapacity): * wtf/text/StringHash.h: * wtf/text/StringImpl.h: * wtf/text/StringToIntegerConversion.h: (WTF::toIntegralType): * wtf/text/SymbolRegistry.h: * wtf/text/TextStream.cpp: (WTF::hasFractions): * wtf/text/WTFString.h: * wtf/text/cocoa/TextBreakIteratorInternalICUCocoa.cpp: Canonical link: https://commits.webkit.org/215538@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@250005 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2019-09-18 00:36:19 +00:00
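The commit message above argues that `static const` may require external storage, while `static constexpr` makes it clear that the value is a literal constant that can be inlined. The sketch below illustrates that point only; `SpinPolicy` and `shouldKeepSpinning` are hypothetical names chosen for the example and do not appear in WebKit.

```cpp
// Hypothetical illustration; SpinPolicy and shouldKeepSpinning are not WebKit names.
struct SpinPolicy {
    // constexpr guarantees a compile-time constant that is usable in
    // constant expressions such as static_assert.
    static constexpr unsigned spinLimit = 40;
};

// Evaluated at compile time when called with a constant argument.
constexpr bool shouldKeepSpinning(unsigned spinCount)
{
    return spinCount < SpinPolicy::spinLimit;
}

static_assert(shouldKeepSpinning(0), "the first attempt should spin");
static_assert(!shouldKeepSpinning(40), "spinning stops once the limit is reached");
```

Because the limit is a constant expression, the checks above are settled by the compiler and cost nothing at run time.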
static constexpr unsigned spinLimit = 40;
unsigned spinCount = 0;
for (;;) {
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-05 17:53:57 +00:00
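The commit message above describes a parallel iterator as an iterator that can do an atomic next() very quickly and signals completion by returning null. The following is a minimal sketch of that idea using only the standard library. It is an assumption-laden stand-in, not WebKit's `RefPtr<SharedTask<...()>>` machinery, and `makeParallelSource` and `drain` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical sketch; makeParallelSource is not a WebKit API.
// Returns a callable that hands out each element of `items` exactly once,
// then returns nullptr. `items` must outlive the returned callable.
template<typename T>
std::function<T*()> makeParallelSource(std::vector<T>& items)
{
    auto index = std::make_shared<std::atomic<std::size_t>>(0);
    return [&items, index]() -> T* {
        // fetch_add plays the role of the atomic next(): every caller,
        // on any thread, claims a distinct index.
        std::size_t i = index->fetch_add(1, std::memory_order_relaxed);
        return i < items.size() ? &items[i] : nullptr;
    };
}

// Drains a source on the calling thread and returns the sum of the visited values.
inline long drain(const std::function<int*()>& next)
{
    long sum = 0;
    while (int* item = next())
        sum += *item;
    assert(!next()); // An exhausted source keeps returning nullptr.
    return sum;
}
```

Several threads could call the same source concurrently, since the only shared mutable state is the atomic index; the sketch drains on one thread to keep the example small.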
LockType currentValue = lock.load();
// We allow ourselves to barge in.
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-05 17:53:57 +00:00
if (!(currentValue & isHeldBit)) {
    if (lock.compareExchangeWeak(currentValue, Hooks::lockHook(currentValue | isHeldBit)))
        return;
    continue;
}
// If there is nobody parked and we haven't spun too much, we can just try to spin around.
if (!(currentValue & hasParkedBit) && spinCount < spinLimit) {
    spinCount++;
    Thread::yield();
    continue;
}
// Need to park. We do this by setting the parked bit first, and then parking. We spin around
// if the parked bit wasn't set and we failed at setting it.
if (!(currentValue & hasParkedBit)) {
LockType newValue = Hooks::parkHook(currentValue | hasParkedBit);
if (!lock.compareExchangeWeak(currentValue, newValue))
continue;
currentValue = newValue;
}
if (!(currentValue & isHeldBit)) {
dataLog("Lock not held!\n");
CRASH_WITH_INFO(currentValue);
}
if (!(currentValue & hasParkedBit)) {
dataLog("Lock not parked!\n");
CRASH_WITH_INFO(currentValue);
        }
        // We now expect the value to be isHeld|hasParked. So long as that's the case, we can park.
        ParkingLot::ParkResult parkResult =
            ParkingLot::compareAndPark(&lock, currentValue);
        if (parkResult.wasUnparked) {
            switch (static_cast<Token>(parkResult.token)) {
            case DirectHandoff:
                // The lock was never released. It was handed to us directly by the thread that did
                // unlock(). This means we're done!
                RELEASE_ASSERT(isLocked(lock));
                return;
            case BargingOpportunity:
                // This is the common case. The thread that called unlock() has released the lock,
                // and we have been woken up so that we may get an opportunity to grab the lock. But
                // other threads may barge, so the best that we can do is loop around and try again.
                break;
            }
        }
        // We have awoken, or we never parked because the byte value changed. Either way, we loop
        // around and try again.
    }
}
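The two wake-up tokens handled at the end of lockSlow() can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the WTF implementation: `ToyLock`, `WakeToken`, `unlockWithWaiter`, `onWake`, and `tryLock` are hypothetical names, there is no real parking or atomics, and the rule that a fair unlock hands the lock off directly while an unfair unlock releases it is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; none of these are WTF APIs.
enum class WakeToken : std::uint8_t { BargingOpportunity, DirectHandoff };

constexpr std::uint8_t heldBit = 1;   // plays the role of isHeldBit
constexpr std::uint8_t parkedBit = 2; // plays the role of hasParkedBit

struct ToyLock {
    std::uint8_t bits = 0;
};

// Models the unlocking side when a waiter is parked.
// Assumption: a fair unlock keeps the held bit set and passes ownership to the
// waiter; an unfair unlock clears the held bit so that any thread may grab it.
inline WakeToken unlockWithWaiter(ToyLock& lock, bool fair, bool moreWaiters)
{
    assert(lock.bits & heldBit);
    std::uint8_t parked = moreWaiters ? parkedBit : static_cast<std::uint8_t>(0);
    if (fair) {
        lock.bits = static_cast<std::uint8_t>(heldBit | parked);
        return WakeToken::DirectHandoff;
    }
    lock.bits = parked;
    return WakeToken::BargingOpportunity;
}

// Models the woken side. Returns true if the woken thread already owns the
// lock, false if it has to loop around and compete for it again.
inline bool onWake(const ToyLock& lock, WakeToken token)
{
    switch (token) {
    case WakeToken::DirectHandoff:
        // The lock was never released, so it must still look held.
        assert(lock.bits & heldBit);
        return true;
    case WakeToken::BargingOpportunity:
        return false;
    }
    return false;
}

// A plain acquisition attempt, as a barging thread or a retrying waiter would make.
inline bool tryLock(ToyLock& lock)
{
    if (lock.bits & heldBit)
        return false;
    lock.bits = static_cast<std::uint8_t>(lock.bits | heldBit);
    return true;
}
```

In this model a direct handoff leaves no window for another thread to take the lock, whereas a barging opportunity leaves the held bit clear until some thread, not necessarily the woken one, sets it again.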
template<typename LockType, LockType isHeldBit, LockType hasParkedBit, typename Hooks>
void LockAlgorithm<LockType, isHeldBit, hasParkedBit, Hooks>::unlockSlow(Atomic<LockType>& lock, Fairness fairness)
{
    // We could get here because the weak CAS in unlock() failed spuriously, or because there is
    // someone parked. So, we need a CAS loop: even if right now the lock is just held, it could
    // be held and parked if someone attempts to lock just as we are unlocking.
    for (;;) {
        uint8_t oldByteValue = lock.load();
        if ((oldByteValue & mask) != isHeldBit
            && (oldByteValue & mask) != (isHeldBit | hasParkedBit)) {
            dataLog("Invalid value for lock: ", oldByteValue, "\n");
            CRASH_WITH_INFO(oldByteValue);
        }
        if ((oldByteValue & mask) == isHeldBit) {
2017-12-05 17:53:57 +00:00
            if (lock.compareExchangeWeak(oldByteValue, Hooks::unlockHook(oldByteValue & ~isHeldBit)))
                return;
            continue;
        }
        // Someone is parked. Unpark exactly one thread. We may hand the lock to that thread
        // directly, or we will unlock the lock at the same time as we unpark to allow for barging.
        // When we unlock, we may leave the parked bit set if there is a chance that there are still
        // other threads parked.
        ASSERT((oldByteValue & mask) == (isHeldBit | hasParkedBit));
        ParkingLot::unparkOne(
            &lock,
            [&] (ParkingLot::UnparkResult result) -> intptr_t {
                // We are the only ones that can clear either the isHeldBit or the hasParkedBit,
                // so we should still see both bits set right now.
                ASSERT((lock.load() & mask) == (isHeldBit | hasParkedBit));
                if (result.didUnparkThread && (fairness == Fair || result.timeToBeFair)) {
                    // We don't unlock anything. Instead, we hand the lock to the thread that was
                    // waiting.
                    lock.transaction(
                        [&] (LockType& value) -> bool {
                            LockType newValue = Hooks::handoffHook(value);
                            if (newValue == value)
                                return false;
                            value = newValue;
                            return true;
                        });
                    return DirectHandoff;
                }
                lock.transaction(
                    [&] (LockType& value) -> bool {
                        value &= ~mask;
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-05 17:53:57 +00:00
value = Hooks::unlockHook(value);
2017-07-22 14:36:18 +00:00
if (result.mayHaveMoreThreads)
value |= hasParkedBit;
return true;
});
return BargingOpportunity;
});
return;
}
}
} // namespace WTF
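
The closing lines of the unlock path above clear the lock bits, run the unlock hook, and set the parked bit again when more threads may still be waiting. The following is a minimal standalone sketch of that word update, not the WTF implementation: the bit values, the identity `unlockHook`, and the name `releaseAfterUnpark` are assumptions made for illustration, and a plain compare-and-swap loop stands in for the transaction helper.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical bit layout, chosen for illustration only.
constexpr uint8_t isHeldBit = 1;
constexpr uint8_t hasParkedBit = 2;
constexpr uint8_t mask = isHeldBit | hasParkedBit;

// Stand-in for Hooks::unlockHook; this sketch assumes a hook that changes nothing.
inline uint8_t unlockHook(uint8_t value) { return value; }

// Updates the lock word after one waiter has been woken: clear both bits,
// run the hook, then set the parked bit again if other waiters may remain.
// Returns the value that was stored.
inline uint8_t releaseAfterUnpark(std::atomic<uint8_t>& lock, bool mayHaveMoreThreads)
{
    uint8_t oldValue = lock.load();
    for (;;) {
        uint8_t newValue = oldValue & static_cast<uint8_t>(~mask);
        newValue = unlockHook(newValue);
        if (mayHaveMoreThreads)
            newValue |= hasParkedBit;
        // On failure compare_exchange_weak reloads oldValue, so the loop retries
        // with the latest word.
        if (lock.compare_exchange_weak(oldValue, newValue))
            return newValue;
    }
}
```

Leaving the parked bit set when waiters may remain means the next unlock takes the slow path again and wakes another thread, while the cleared held bit lets any running thread barge in and take the lock.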