haikuwebkit/Source/WTF/wtf/Lock.h


Lightweight locks should be adaptive
https://bugs.webkit.org/show_bug.cgi?id=147545

Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

* dfg/DFGCommon.cpp: (JSC::DFG::startCrashing):
* heap/CopiedBlock.h: (JSC::CopiedBlock::workListLock):
* heap/CopiedBlockInlines.h: (JSC::CopiedBlock::shouldReportLiveBytes): (JSC::CopiedBlock::reportLiveBytes):
* heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock):
* heap/CopiedSpace.h: (JSC::CopiedSpace::CopiedGeneration::CopiedGeneration):
* heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleEvacuatedBlock):
* heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::didStartCopying):
* heap/GCThreadSharedData.h: (JSC::GCThreadSharedData::getNextBlocksToCopy):
* heap/ListableHandler.h: (JSC::ListableHandler::List::addThreadSafe): (JSC::ListableHandler::List::addNotThreadSafe):
* heap/MachineStackMarker.cpp: (JSC::MachineThreads::tryCopyOtherThreadStacks):
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::copyLater):
* parser/SourceProvider.cpp: (JSC::SourceProvider::~SourceProvider): (JSC::SourceProvider::getID):
* profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::addDatabaseToAtExit): (JSC::Profiler::Database::removeDatabaseFromAtExit): (JSC::Profiler::Database::removeFirstAtExitDatabase):
* runtime/TypeProfilerLog.h:

Source/WebCore:

* bindings/objc/WebScriptObject.mm: (WebCore::getJSWrapper): (WebCore::addJSWrapper): (WebCore::removeJSWrapper): (WebCore::removeJSWrapperIfRetainCountOne):
* platform/audio/mac/CARingBuffer.cpp: (WebCore::CARingBuffer::setCurrentFrameBounds): (WebCore::CARingBuffer::getCurrentFrameBounds):
* platform/audio/mac/CARingBuffer.h:
* platform/ios/wak/WAKWindow.mm: (-[WAKWindow setExposedScrollViewRect:]): (-[WAKWindow exposedScrollViewRect]):

Source/WebKit2:

* WebProcess/WebPage/EventDispatcher.cpp: (WebKit::EventDispatcher::clearQueuedTouchEventsForPage): (WebKit::EventDispatcher::getQueuedTouchEventsForPage): (WebKit::EventDispatcher::touchEvent): (WebKit::EventDispatcher::dispatchTouchEvents):
* WebProcess/WebPage/EventDispatcher.h:
* WebProcess/WebPage/ViewUpdateDispatcher.cpp: (WebKit::ViewUpdateDispatcher::visibleContentRectUpdate): (WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate):
* WebProcess/WebPage/ViewUpdateDispatcher.h:

Source/WTF:

A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big.

But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock.

The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock.

This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time.

On a locking microbenchmark, this new Lock exhibits the following performance characteristics:

- Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex.
- Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex.
- CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock.
- Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock.

This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time.

This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock. But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665.

Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head.

[1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf

* WTF.vcxproj/WTF.vcxproj:
* WTF.xcodeproj/project.pbxproj:
* benchmarks: Added.
* benchmarks/LockSpeedTest.cpp: Added. (main):
* wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong):
* wtf/CMakeLists.txt:
* wtf/Lock.cpp: Added. (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow):
* wtf/Lock.h: Added. (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock):
* wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocatorHandle::shrink): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): (WTF::MetaAllocator::addFreshFreeSpace): (WTF::MetaAllocator::debugFreeSpaceSize):
* wtf/MetaAllocator.h:
* wtf/SpinLock.h:
* wtf/ThreadingPthreads.cpp:
* wtf/ThreadingWin.cpp:
* wtf/text/AtomicString.cpp:
* wtf/text/AtomicStringImpl.cpp: (WTF::AtomicStringTableLocker::AtomicStringTableLocker):

Tools:

* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Lock.cpp: Added. (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST):

Canonical link: https://commits.webkit.org/165908@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-07 22:38:59 +00:00
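The three levels of adaptation described in the commit message above (inlined CAS fast path, a short burst of spinning, then sleeping) can be sketched roughly as follows. This is a minimal illustration and not the WTF::Lock implementation: the names AdaptiveLockSketch, hasSleepersBit, spinLimit, and runCounterDemo are hypothetical, the spin limit is an arbitrary value, and a std::mutex with a std::condition_variable stands in for the queue of thread nodes that the patch encodes in a tagged pointer. Because the fast paths use a weak CAS, unlockSlow() may run with no sleepers at all, which mirrors the "null queue head" case the commit message mentions.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of an adaptive lock; not the actual WTF::Lock.
class AdaptiveLockSketch {
public:
    void lock()
    {
        // Fast path: one CAS acquires a lock that is not held.
        uint8_t expected = 0;
        if (m_state.compare_exchange_weak(expected, isHeldBit, std::memory_order_acquire))
            return;
        lockSlow();
    }

    void unlock()
    {
        // Fast path: one CAS releases a lock that has no sleepers.
        // A weak CAS may fail spuriously, so unlockSlow() must cope
        // with being called when nobody is sleeping.
        uint8_t expected = isHeldBit;
        if (m_state.compare_exchange_weak(expected, 0, std::memory_order_release))
            return;
        unlockSlow();
    }

private:
    static constexpr uint8_t isHeldBit = 1;
    static constexpr uint8_t hasSleepersBit = 2;
    static constexpr unsigned spinLimit = 40; // Arbitrary value for this sketch.

    void lockSlow()
    {
        unsigned spinCount = 0;
        for (;;) {
            uint8_t current = m_state.load();
            if (!(current & isHeldBit)) {
                // Lock looks free: try to grab it.
                if (m_state.compare_exchange_weak(current, static_cast<uint8_t>(current | isHeldBit), std::memory_order_acquire))
                    return;
                continue;
            }
            // Microcontention: spin briefly, yielding the CPU each time.
            if (spinCount < spinLimit) {
                ++spinCount;
                std::this_thread::yield();
                continue;
            }
            // Long wait: record that a sleeper exists, then sleep.
            std::unique_lock<std::mutex> guard(m_sleepMutex);
            current = m_state.load();
            if (!(current & isHeldBit))
                continue;
            if (!(current & hasSleepersBit)
                && !m_state.compare_exchange_strong(current, static_cast<uint8_t>(current | hasSleepersBit)))
                continue;
            m_sleepCondition.wait(guard); // Spurious wakeups are fine; we loop.
        }
    }

    void unlockSlow()
    {
        // Release the lock and wake every sleeper (possibly none).
        std::lock_guard<std::mutex> guard(m_sleepMutex);
        m_state.store(0, std::memory_order_release);
        m_sleepCondition.notify_all();
    }

    std::atomic<uint8_t> m_state { 0 };
    std::mutex m_sleepMutex;
    std::condition_variable m_sleepCondition;
};

// Each thread increments a shared counter while holding the lock.
int runCounterDemo(int threadCount, int iterationsPerThread)
{
    assert(threadCount > 0);
    AdaptiveLockSketch lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back([&] {
            for (int j = 0; j < iterationsPerThread; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

Calling runCounterDemo(4, 10000) from a test or a main() should return 40000 when mutual exclusion holds. Unlike the lock in the patch, this sketch is far larger than sizeof(void*), since it carries a full system mutex and condition variable; it only illustrates the control flow, not the space savings.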
/*
Use constexpr instead of const in symbol definitions that are obviously constexpr. https://bugs.webkit.org/show_bug.cgi?id=201879 Rubber-stamped by Joseph Pecoraro. Source/bmalloc: * bmalloc/AvailableMemory.cpp: * bmalloc/IsoTLS.h: * bmalloc/Map.h: * bmalloc/Mutex.cpp: (bmalloc::Mutex::lockSlowCase): * bmalloc/PerThread.h: * bmalloc/Vector.h: * bmalloc/Zone.h: Source/JavaScriptCore: const may require external storage (at the compiler's whim) though these currently do not. constexpr makes it clear that the value is a literal constant that can be inlined. In most cases in the code, when we say static const, we actually mean static constexpr. I'm changing the code to reflect this. * API/JSAPIValueWrapper.h: * API/JSCallbackConstructor.h: * API/JSCallbackObject.h: * API/JSContextRef.cpp: * API/JSWrapperMap.mm: * API/tests/CompareAndSwapTest.cpp: * API/tests/TypedArrayCTest.cpp: * API/tests/testapi.mm: (testObjectiveCAPIMain): * KeywordLookupGenerator.py: (Trie.printAsC): * assembler/ARMv7Assembler.h: * assembler/AssemblerBuffer.h: * assembler/AssemblerCommon.h: * assembler/MacroAssembler.h: * assembler/MacroAssemblerARM64.h: * assembler/MacroAssemblerARM64E.h: * assembler/MacroAssemblerARMv7.h: * assembler/MacroAssemblerCodeRef.h: * assembler/MacroAssemblerMIPS.h: * assembler/MacroAssemblerX86.h: * assembler/MacroAssemblerX86Common.h: (JSC::MacroAssemblerX86Common::absDouble): (JSC::MacroAssemblerX86Common::negateDouble): * assembler/MacroAssemblerX86_64.h: * assembler/X86Assembler.h: * b3/B3Bank.h: * b3/B3CheckSpecial.h: * b3/B3DuplicateTails.cpp: * b3/B3EliminateCommonSubexpressions.cpp: * b3/B3FixSSA.cpp: * b3/B3FoldPathConstants.cpp: * b3/B3InferSwitches.cpp: * b3/B3Kind.h: * b3/B3LowerToAir.cpp: * b3/B3NativeTraits.h: * b3/B3ReduceDoubleToFloat.cpp: * b3/B3ReduceLoopStrength.cpp: * b3/B3ReduceStrength.cpp: * b3/B3ValueKey.h: * b3/air/AirAllocateRegistersByGraphColoring.cpp: * b3/air/AirAllocateStackByGraphColoring.cpp: * b3/air/AirArg.h: * 
b3/air/AirCCallSpecial.h: * b3/air/AirEmitShuffle.cpp: * b3/air/AirFixObviousSpills.cpp: * b3/air/AirFormTable.h: * b3/air/AirLowerAfterRegAlloc.cpp: * b3/air/AirPrintSpecial.h: * b3/air/AirStackAllocation.cpp: * b3/air/AirTmp.h: * b3/testb3_6.cpp: (testInterpreter): * bytecode/AccessCase.cpp: * bytecode/CallLinkStatus.cpp: * bytecode/CallVariant.h: * bytecode/CodeBlock.h: * bytecode/CodeOrigin.h: * bytecode/DFGExitProfile.h: * bytecode/DirectEvalCodeCache.h: * bytecode/ExecutableToCodeBlockEdge.h: * bytecode/GetterSetterAccessCase.cpp: * bytecode/LazyOperandValueProfile.h: * bytecode/ObjectPropertyCondition.h: * bytecode/ObjectPropertyConditionSet.cpp: * bytecode/PolymorphicAccess.cpp: * bytecode/PropertyCondition.h: * bytecode/SpeculatedType.h: * bytecode/StructureStubInfo.cpp: * bytecode/UnlinkedCodeBlock.cpp: (JSC::UnlinkedCodeBlock::typeProfilerExpressionInfoForBytecodeOffset): * bytecode/UnlinkedCodeBlock.h: * bytecode/UnlinkedEvalCodeBlock.h: * bytecode/UnlinkedFunctionCodeBlock.h: * bytecode/UnlinkedFunctionExecutable.h: * bytecode/UnlinkedModuleProgramCodeBlock.h: * bytecode/UnlinkedProgramCodeBlock.h: * bytecode/ValueProfile.h: * bytecode/VirtualRegister.h: * bytecode/Watchpoint.h: * bytecompiler/BytecodeGenerator.h: * bytecompiler/Label.h: * bytecompiler/NodesCodegen.cpp: (JSC::ThisNode::emitBytecode): * bytecompiler/RegisterID.h: * debugger/Breakpoint.h: * debugger/DebuggerParseData.cpp: * debugger/DebuggerPrimitives.h: * debugger/DebuggerScope.h: * dfg/DFGAbstractHeap.h: * dfg/DFGAbstractValue.h: * dfg/DFGArgumentsEliminationPhase.cpp: * dfg/DFGByteCodeParser.cpp: * dfg/DFGCSEPhase.cpp: * dfg/DFGCommon.h: * dfg/DFGCompilationKey.h: * dfg/DFGDesiredGlobalProperty.h: * dfg/DFGEdgeDominates.h: * dfg/DFGEpoch.h: * dfg/DFGForAllKills.h: (JSC::DFG::forAllKilledNodesAtNodeIndex): * dfg/DFGGraph.cpp: (JSC::DFG::Graph::isLiveInBytecode): * dfg/DFGHeapLocation.h: * dfg/DFGInPlaceAbstractState.cpp: * dfg/DFGIntegerCheckCombiningPhase.cpp: * 
dfg/DFGIntegerRangeOptimizationPhase.cpp: * dfg/DFGInvalidationPointInjectionPhase.cpp: * dfg/DFGLICMPhase.cpp: * dfg/DFGLazyNode.h: * dfg/DFGMinifiedID.h: * dfg/DFGMovHintRemovalPhase.cpp: * dfg/DFGNodeFlowProjection.h: * dfg/DFGNodeType.h: * dfg/DFGObjectAllocationSinkingPhase.cpp: * dfg/DFGPhantomInsertionPhase.cpp: * dfg/DFGPromotedHeapLocation.h: * dfg/DFGPropertyTypeKey.h: * dfg/DFGPureValue.h: * dfg/DFGPutStackSinkingPhase.cpp: * dfg/DFGRegisterBank.h: * dfg/DFGSSAConversionPhase.cpp: * dfg/DFGSSALoweringPhase.cpp: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::compileDoubleRep): (JSC::DFG::compileClampDoubleToByte): (JSC::DFG::SpeculativeJIT::compileArithRounding): (JSC::DFG::compileArithPowIntegerFastPath): (JSC::DFG::SpeculativeJIT::compileArithPow): (JSC::DFG::SpeculativeJIT::emitBinarySwitchStringRecurse): * dfg/DFGStackLayoutPhase.cpp: * dfg/DFGStoreBarrierInsertionPhase.cpp: * dfg/DFGStrengthReductionPhase.cpp: * dfg/DFGStructureAbstractValue.h: * dfg/DFGVarargsForwardingPhase.cpp: * dfg/DFGVariableEventStream.cpp: (JSC::DFG::VariableEventStream::reconstruct const): * dfg/DFGWatchpointCollectionPhase.cpp: * disassembler/ARM64/A64DOpcode.h: * ftl/FTLLocation.h: * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compileArithRandom): * ftl/FTLSlowPathCall.cpp: * ftl/FTLSlowPathCallKey.h: * heap/CellContainer.h: * heap/CellState.h: * heap/ConservativeRoots.h: * heap/GCSegmentedArray.h: * heap/HandleBlock.h: * heap/Heap.cpp: (JSC::Heap::updateAllocationLimits): * heap/Heap.h: * heap/HeapSnapshot.h: * heap/HeapUtil.h: (JSC::HeapUtil::findGCObjectPointersForMarking): * heap/IncrementalSweeper.cpp: * heap/LargeAllocation.h: * heap/MarkedBlock.cpp: * heap/Strong.h: * heap/VisitRaceKey.h: * heap/Weak.h: * heap/WeakBlock.h: * inspector/JSInjectedScriptHost.h: * inspector/JSInjectedScriptHostPrototype.h: * inspector/JSJavaScriptCallFrame.h: * inspector/JSJavaScriptCallFramePrototype.h: * inspector/agents/InspectorConsoleAgent.cpp: * 
inspector/agents/InspectorRuntimeAgent.cpp: (Inspector::InspectorRuntimeAgent::getRuntimeTypesForVariablesAtOffsets): * inspector/scripts/codegen/generate_cpp_protocol_types_header.py: (CppProtocolTypesHeaderGenerator._generate_versions): * inspector/scripts/tests/generic/expected/version.json-result: * interpreter/Interpreter.h: * interpreter/ShadowChicken.cpp: * jit/BinarySwitch.cpp: * jit/CallFrameShuffler.h: * jit/ExecutableAllocator.h: * jit/FPRInfo.h: * jit/GPRInfo.h: * jit/ICStats.h: * jit/JITThunks.h: * jit/Reg.h: * jit/RegisterSet.h: * jit/TempRegisterSet.h: * jsc.cpp: * parser/ASTBuilder.h: * parser/Nodes.h: * parser/SourceCodeKey.h: * parser/SyntaxChecker.h: * parser/VariableEnvironment.h: * profiler/ProfilerOrigin.h: * profiler/ProfilerOriginStack.h: * profiler/ProfilerUID.h: * runtime/AbstractModuleRecord.cpp: * runtime/ArrayBufferNeuteringWatchpointSet.h: * runtime/ArrayConstructor.h: * runtime/ArrayConventions.h: * runtime/ArrayIteratorPrototype.h: * runtime/ArrayPrototype.cpp: (JSC::setLength): * runtime/AsyncFromSyncIteratorPrototype.h: * runtime/AsyncGeneratorFunctionPrototype.h: * runtime/AsyncGeneratorPrototype.h: * runtime/AsyncIteratorPrototype.h: * runtime/AtomicsObject.cpp: * runtime/BigIntConstructor.h: * runtime/BigIntPrototype.h: * runtime/BooleanPrototype.h: * runtime/ClonedArguments.h: * runtime/CodeCache.h: * runtime/ControlFlowProfiler.h: * runtime/CustomGetterSetter.h: * runtime/DateConstructor.h: * runtime/DatePrototype.h: * runtime/DefinePropertyAttributes.h: * runtime/ErrorPrototype.h: * runtime/EvalExecutable.h: * runtime/Exception.h: * runtime/ExceptionHelpers.cpp: (JSC::invalidParameterInSourceAppender): (JSC::invalidParameterInstanceofSourceAppender): * runtime/ExceptionHelpers.h: * runtime/ExecutableBase.h: * runtime/FunctionExecutable.h: * runtime/FunctionRareData.h: * runtime/GeneratorPrototype.h: * runtime/GenericArguments.h: * runtime/GenericOffset.h: * runtime/GetPutInfo.h: * runtime/GetterSetter.h: * 
runtime/GlobalExecutable.h: * runtime/Identifier.h: * runtime/InspectorInstrumentationObject.h: * runtime/InternalFunction.h: * runtime/IntlCollatorConstructor.h: * runtime/IntlCollatorPrototype.h: * runtime/IntlDateTimeFormatConstructor.h: * runtime/IntlDateTimeFormatPrototype.h: * runtime/IntlNumberFormatConstructor.h: * runtime/IntlNumberFormatPrototype.h: * runtime/IntlObject.h: * runtime/IntlPluralRulesConstructor.h: * runtime/IntlPluralRulesPrototype.h: * runtime/IteratorPrototype.h: * runtime/JSArray.cpp: (JSC::JSArray::tryCreateUninitializedRestricted): * runtime/JSArray.h: * runtime/JSArrayBuffer.h: * runtime/JSArrayBufferView.h: * runtime/JSBigInt.h: * runtime/JSCJSValue.h: * runtime/JSCell.h: * runtime/JSCustomGetterSetterFunction.h: * runtime/JSDataView.h: * runtime/JSDataViewPrototype.h: * runtime/JSDestructibleObject.h: * runtime/JSFixedArray.h: * runtime/JSGenericTypedArrayView.h: * runtime/JSGlobalLexicalEnvironment.h: * runtime/JSGlobalObject.h: * runtime/JSImmutableButterfly.h: * runtime/JSInternalPromiseConstructor.h: * runtime/JSInternalPromiseDeferred.h: * runtime/JSInternalPromisePrototype.h: * runtime/JSLexicalEnvironment.h: * runtime/JSModuleEnvironment.h: * runtime/JSModuleLoader.h: * runtime/JSModuleNamespaceObject.h: * runtime/JSNonDestructibleProxy.h: * runtime/JSONObject.cpp: * runtime/JSONObject.h: * runtime/JSObject.h: * runtime/JSPromiseConstructor.h: * runtime/JSPromiseDeferred.h: * runtime/JSPromisePrototype.h: * runtime/JSPropertyNameEnumerator.h: * runtime/JSProxy.h: * runtime/JSScope.h: * runtime/JSScriptFetchParameters.h: * runtime/JSScriptFetcher.h: * runtime/JSSegmentedVariableObject.h: * runtime/JSSourceCode.h: * runtime/JSString.cpp: * runtime/JSString.h: * runtime/JSSymbolTableObject.h: * runtime/JSTemplateObjectDescriptor.h: * runtime/JSTypeInfo.h: * runtime/MapPrototype.h: * runtime/MinimumReservedZoneSize.h: * runtime/ModuleProgramExecutable.h: * runtime/NativeExecutable.h: * runtime/NativeFunction.h: * 
runtime/NativeStdFunctionCell.h: * runtime/NumberConstructor.h: * runtime/NumberPrototype.h: * runtime/ObjectConstructor.h: * runtime/ObjectPrototype.h: * runtime/ProgramExecutable.h: * runtime/PromiseDeferredTimer.cpp: * runtime/PropertyMapHashTable.h: * runtime/PropertyNameArray.h: (JSC::PropertyNameArray::add): * runtime/PrototypeKey.h: * runtime/ProxyConstructor.h: * runtime/ProxyObject.cpp: (JSC::ProxyObject::performGetOwnPropertyNames): * runtime/ProxyRevoke.h: * runtime/ReflectObject.h: * runtime/RegExp.h: * runtime/RegExpCache.h: * runtime/RegExpConstructor.h: * runtime/RegExpKey.h: * runtime/RegExpObject.h: * runtime/RegExpPrototype.h: * runtime/RegExpStringIteratorPrototype.h: * runtime/SamplingProfiler.cpp: * runtime/ScopedArgumentsTable.h: * runtime/ScriptExecutable.h: * runtime/SetPrototype.h: * runtime/SmallStrings.h: * runtime/SparseArrayValueMap.h: * runtime/StringConstructor.h: * runtime/StringIteratorPrototype.h: * runtime/StringObject.h: * runtime/StringPrototype.h: * runtime/Structure.h: * runtime/StructureChain.h: * runtime/StructureRareData.h: * runtime/StructureTransitionTable.h: * runtime/Symbol.h: * runtime/SymbolConstructor.h: * runtime/SymbolPrototype.h: * runtime/SymbolTable.h: * runtime/TemplateObjectDescriptor.h: * runtime/TypeProfiler.cpp: * runtime/TypeProfiler.h: * runtime/TypeProfilerLog.cpp: * runtime/VarOffset.h: * testRegExp.cpp: * tools/HeapVerifier.cpp: (JSC::HeapVerifier::checkIfRecorded): * tools/JSDollarVM.cpp: * wasm/WasmB3IRGenerator.cpp: * wasm/WasmBBQPlan.cpp: * wasm/WasmFaultSignalHandler.cpp: * wasm/WasmFunctionParser.h: * wasm/WasmOMGForOSREntryPlan.cpp: * wasm/WasmOMGPlan.cpp: * wasm/WasmPlan.cpp: * wasm/WasmSignature.cpp: * wasm/WasmSignature.h: * wasm/WasmWorklist.cpp: * wasm/js/JSWebAssembly.h: * wasm/js/JSWebAssemblyCodeBlock.h: * wasm/js/WebAssemblyCompileErrorConstructor.h: * wasm/js/WebAssemblyCompileErrorPrototype.h: * wasm/js/WebAssemblyFunction.h: * wasm/js/WebAssemblyInstanceConstructor.h: * 
wasm/js/WebAssemblyInstancePrototype.h: * wasm/js/WebAssemblyLinkErrorConstructor.h: * wasm/js/WebAssemblyLinkErrorPrototype.h: * wasm/js/WebAssemblyMemoryConstructor.h: * wasm/js/WebAssemblyMemoryPrototype.h: * wasm/js/WebAssemblyModuleConstructor.h: * wasm/js/WebAssemblyModulePrototype.h: * wasm/js/WebAssemblyRuntimeErrorConstructor.h: * wasm/js/WebAssemblyRuntimeErrorPrototype.h: * wasm/js/WebAssemblyTableConstructor.h: * wasm/js/WebAssemblyTablePrototype.h: * wasm/js/WebAssemblyToJSCallee.h: * yarr/Yarr.h: * yarr/YarrParser.h: * yarr/generateYarrCanonicalizeUnicode: Source/WebCore: No new tests. Covered by existing tests. * bindings/js/JSDOMConstructorBase.h: * bindings/js/JSDOMWindowProperties.h: * bindings/scripts/CodeGeneratorJS.pm: (GenerateHeader): (GeneratePrototypeDeclaration): * bindings/scripts/test/JS/JSTestActiveDOMObject.h: * bindings/scripts/test/JS/JSTestEnabledBySetting.h: * bindings/scripts/test/JS/JSTestEnabledForContext.h: * bindings/scripts/test/JS/JSTestEventTarget.h: * bindings/scripts/test/JS/JSTestGlobalObject.h: * bindings/scripts/test/JS/JSTestIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedGetterCallWith.h: * bindings/scripts/test/JS/JSTestNamedGetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedGetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterNoIdentifier.h: * 
bindings/scripts/test/JS/JSTestNamedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetterAndSetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgableProperties.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgablePropertiesAndOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestObj.h: * bindings/scripts/test/JS/JSTestOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestPluginInterface.h: * bindings/scripts/test/JS/JSTestTypedefs.h: * bridge/objc/objc_runtime.h: * bridge/runtime_array.h: * bridge/runtime_method.h: * bridge/runtime_object.h: Source/WebKit: * WebProcess/Plugins/Netscape/JSNPObject.h: Source/WTF: * wtf/Assertions.cpp: * wtf/AutomaticThread.cpp: * wtf/BitVector.h: * wtf/Bitmap.h: * wtf/BloomFilter.h: * wtf/Brigand.h: * wtf/CheckedArithmetic.h: * wtf/CrossThreadCopier.h: * wtf/CurrentTime.cpp: * wtf/DataLog.cpp: * wtf/DateMath.cpp: (WTF::daysFrom1970ToYear): * wtf/DeferrableRefCounted.h: * wtf/GetPtr.h: * wtf/HashFunctions.h: * wtf/HashMap.h: * wtf/HashTable.h: * wtf/HashTraits.h: * wtf/JSONValues.cpp: * wtf/JSONValues.h: * wtf/ListHashSet.h: * wtf/Lock.h: * wtf/LockAlgorithm.h: * wtf/LockAlgorithmInlines.h: (WTF::Hooks>::lockSlow): * wtf/Logger.h: * wtf/LoggerHelper.h: (WTF::LoggerHelper::childLogIdentifier const): * wtf/MainThread.cpp: * wtf/MetaAllocatorPtr.h: * wtf/MonotonicTime.h: * wtf/NaturalLoops.h: (WTF::NaturalLoops::NaturalLoops): * wtf/ObjectIdentifier.h: * wtf/RAMSize.cpp: * wtf/Ref.h: * wtf/RefPtr.h: * wtf/RetainPtr.h: * wtf/SchedulePair.h: * wtf/StackShot.h: * wtf/StdLibExtras.h: * wtf/TinyPtrSet.h: * wtf/URL.cpp: * wtf/URLHash.h: * wtf/URLParser.cpp: (WTF::URLParser::defaultPortForProtocol): * wtf/Vector.h: * wtf/VectorTraits.h: * wtf/WallTime.h: * wtf/WeakHashSet.h: * wtf/WordLock.h: * 
wtf/cocoa/CPUTimeCocoa.cpp: * wtf/cocoa/MemoryPressureHandlerCocoa.mm: * wtf/persistence/PersistentDecoder.h: * wtf/persistence/PersistentEncoder.h: * wtf/text/AtomStringHash.h: * wtf/text/CString.h: * wtf/text/StringBuilder.cpp: (WTF::expandedCapacity): * wtf/text/StringHash.h: * wtf/text/StringImpl.h: * wtf/text/StringToIntegerConversion.h: (WTF::toIntegralType): * wtf/text/SymbolRegistry.h: * wtf/text/TextStream.cpp: (WTF::hasFractions): * wtf/text/WTFString.h: * wtf/text/cocoa/TextBreakIteratorInternalICUCocoa.cpp: Canonical link: https://commits.webkit.org/215538@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@250005 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2019-09-18 00:36:19 +00:00
* Copyright (C) 2015-2019 Apple Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
* PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
* CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
* EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
* PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
* OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
Use pragma once in WTF https://bugs.webkit.org/show_bug.cgi?id=190527 Reviewed by Chris Dumez. Source/WTF: We also need to consistently include wtf headers from within wtf so we can build wtf without symbol redefinition errors from including the copy in Source and the copy in the build directory. * wtf/ASCIICType.h: * wtf/Assertions.cpp: * wtf/Assertions.h: * wtf/Atomics.h: * wtf/AutomaticThread.cpp: * wtf/AutomaticThread.h: * wtf/BackwardsGraph.h: * wtf/Bag.h: * wtf/BagToHashMap.h: * wtf/BitVector.cpp: * wtf/BitVector.h: * wtf/Bitmap.h: * wtf/BloomFilter.h: * wtf/Box.h: * wtf/BubbleSort.h: * wtf/BumpPointerAllocator.h: * wtf/ByteOrder.h: * wtf/CPUTime.cpp: * wtf/CallbackAggregator.h: * wtf/CheckedArithmetic.h: * wtf/CheckedBoolean.h: * wtf/ClockType.cpp: * wtf/ClockType.h: * wtf/CommaPrinter.h: * wtf/CompilationThread.cpp: * wtf/CompilationThread.h: * wtf/Compiler.h: * wtf/ConcurrentPtrHashSet.cpp: * wtf/ConcurrentVector.h: * wtf/Condition.h: * wtf/CountingLock.cpp: * wtf/CrossThreadTaskHandler.cpp: * wtf/CryptographicUtilities.cpp: * wtf/CryptographicUtilities.h: * wtf/CryptographicallyRandomNumber.cpp: * wtf/CryptographicallyRandomNumber.h: * wtf/CurrentTime.cpp: * wtf/DataLog.cpp: * wtf/DataLog.h: * wtf/DateMath.cpp: * wtf/DateMath.h: * wtf/DecimalNumber.cpp: * wtf/DecimalNumber.h: * wtf/Deque.h: * wtf/DisallowCType.h: * wtf/Dominators.h: * wtf/DoublyLinkedList.h: * wtf/FastBitVector.cpp: * wtf/FastMalloc.cpp: * wtf/FastMalloc.h: * wtf/FeatureDefines.h: * wtf/FilePrintStream.cpp: * wtf/FilePrintStream.h: * wtf/FlipBytes.h: * wtf/FunctionDispatcher.cpp: * wtf/FunctionDispatcher.h: * wtf/GetPtr.h: * wtf/Gigacage.cpp: * wtf/GlobalVersion.cpp: * wtf/GraphNodeWorklist.h: * wtf/GregorianDateTime.cpp: * wtf/GregorianDateTime.h: * wtf/HashFunctions.h: * wtf/HashMap.h: * wtf/HashMethod.h: * wtf/HashSet.h: * wtf/HashTable.cpp: * wtf/HashTraits.h: * wtf/Indenter.h: * wtf/IndexSparseSet.h: * wtf/InlineASM.h: * wtf/Insertion.h: * wtf/IteratorAdaptors.h: * 
wtf/IteratorRange.h: * wtf/JSONValues.cpp: * wtf/JSValueMalloc.cpp: * wtf/LEBDecoder.h: * wtf/Language.cpp: * wtf/ListDump.h: * wtf/Lock.cpp: * wtf/Lock.h: * wtf/LockAlgorithm.h: * wtf/LockedPrintStream.cpp: * wtf/Locker.h: * wtf/MD5.cpp: * wtf/MD5.h: * wtf/MainThread.cpp: * wtf/MainThread.h: * wtf/MallocPtr.h: * wtf/MathExtras.h: * wtf/MediaTime.cpp: * wtf/MediaTime.h: * wtf/MemoryPressureHandler.cpp: * wtf/MessageQueue.h: * wtf/MetaAllocator.cpp: * wtf/MetaAllocator.h: * wtf/MetaAllocatorHandle.h: * wtf/MonotonicTime.cpp: * wtf/MonotonicTime.h: * wtf/NakedPtr.h: * wtf/NoLock.h: * wtf/NoTailCalls.h: * wtf/Noncopyable.h: * wtf/NumberOfCores.cpp: * wtf/NumberOfCores.h: * wtf/OSAllocator.h: * wtf/OSAllocatorPosix.cpp: * wtf/OSRandomSource.cpp: * wtf/OSRandomSource.h: * wtf/ObjcRuntimeExtras.h: * wtf/OrderMaker.h: * wtf/PackedIntVector.h: * wtf/PageAllocation.h: * wtf/PageBlock.cpp: * wtf/PageBlock.h: * wtf/PageReservation.h: * wtf/ParallelHelperPool.cpp: * wtf/ParallelHelperPool.h: * wtf/ParallelJobs.h: * wtf/ParallelJobsLibdispatch.h: * wtf/ParallelVectorIterator.h: * wtf/ParkingLot.cpp: * wtf/ParkingLot.h: * wtf/Platform.h: * wtf/PointerComparison.h: * wtf/Poisoned.cpp: * wtf/PrintStream.cpp: * wtf/PrintStream.h: * wtf/ProcessID.h: * wtf/ProcessPrivilege.cpp: * wtf/RAMSize.cpp: * wtf/RAMSize.h: * wtf/RandomDevice.cpp: * wtf/RandomNumber.cpp: * wtf/RandomNumber.h: * wtf/RandomNumberSeed.h: * wtf/RangeSet.h: * wtf/RawPointer.h: * wtf/ReadWriteLock.cpp: * wtf/RedBlackTree.h: * wtf/Ref.h: * wtf/RefCountedArray.h: * wtf/RefCountedLeakCounter.cpp: * wtf/RefCountedLeakCounter.h: * wtf/RefCounter.h: * wtf/RefPtr.h: * wtf/RetainPtr.h: * wtf/RunLoop.cpp: * wtf/RunLoop.h: * wtf/RunLoopTimer.h: * wtf/RunLoopTimerCF.cpp: * wtf/SHA1.cpp: * wtf/SHA1.h: * wtf/SaturatedArithmetic.h: (saturatedSubtraction): * wtf/SchedulePair.h: * wtf/SchedulePairCF.cpp: * wtf/SchedulePairMac.mm: * wtf/ScopedLambda.h: * wtf/Seconds.cpp: * wtf/Seconds.h: * wtf/SegmentedVector.h: * 
wtf/SentinelLinkedList.h: * wtf/SharedTask.h: * wtf/SimpleStats.h: * wtf/SingleRootGraph.h: * wtf/SinglyLinkedList.h: * wtf/SixCharacterHash.cpp: * wtf/SixCharacterHash.h: * wtf/SmallPtrSet.h: * wtf/Spectrum.h: * wtf/StackBounds.cpp: * wtf/StackBounds.h: * wtf/StackStats.cpp: * wtf/StackStats.h: * wtf/StackTrace.cpp: * wtf/StdLibExtras.h: * wtf/StreamBuffer.h: * wtf/StringHashDumpContext.h: * wtf/StringPrintStream.cpp: * wtf/StringPrintStream.h: * wtf/ThreadGroup.cpp: * wtf/ThreadMessage.cpp: * wtf/ThreadSpecific.h: * wtf/Threading.cpp: * wtf/Threading.h: * wtf/ThreadingPrimitives.h: * wtf/ThreadingPthreads.cpp: * wtf/TimeWithDynamicClockType.cpp: * wtf/TimeWithDynamicClockType.h: * wtf/TimingScope.cpp: * wtf/TinyLRUCache.h: * wtf/TinyPtrSet.h: * wtf/TriState.h: * wtf/TypeCasts.h: * wtf/UUID.cpp: * wtf/UnionFind.h: * wtf/VMTags.h: * wtf/ValueCheck.h: * wtf/Vector.h: * wtf/VectorTraits.h: * wtf/WallTime.cpp: * wtf/WallTime.h: * wtf/WeakPtr.h: * wtf/WeakRandom.h: * wtf/WordLock.cpp: * wtf/WordLock.h: * wtf/WorkQueue.cpp: * wtf/WorkQueue.h: * wtf/WorkerPool.cpp: * wtf/cf/LanguageCF.cpp: * wtf/cf/RunLoopCF.cpp: * wtf/cocoa/Entitlements.mm: * wtf/cocoa/MachSendRight.cpp: * wtf/cocoa/MainThreadCocoa.mm: * wtf/cocoa/MemoryFootprintCocoa.cpp: * wtf/cocoa/WorkQueueCocoa.cpp: * wtf/dtoa.cpp: * wtf/dtoa.h: * wtf/ios/WebCoreThread.cpp: * wtf/ios/WebCoreThread.h: * wtf/mac/AppKitCompatibilityDeclarations.h: * wtf/mac/DeprecatedSymbolsUsedBySafari.mm: * wtf/mbmalloc.cpp: * wtf/persistence/PersistentCoders.cpp: * wtf/persistence/PersistentDecoder.cpp: * wtf/persistence/PersistentEncoder.cpp: * wtf/spi/cf/CFBundleSPI.h: * wtf/spi/darwin/CommonCryptoSPI.h: * wtf/text/ASCIIFastPath.h: * wtf/text/ASCIILiteral.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicString.h: * wtf/text/AtomicStringHash.h: * wtf/text/AtomicStringImpl.cpp: * wtf/text/AtomicStringImpl.h: * wtf/text/AtomicStringTable.cpp: * wtf/text/AtomicStringTable.h: * wtf/text/Base64.cpp: * wtf/text/CString.cpp: * 
wtf/text/CString.h: * wtf/text/ConversionMode.h: * wtf/text/ExternalStringImpl.cpp: * wtf/text/IntegerToStringConversion.h: * wtf/text/LChar.h: * wtf/text/LineEnding.cpp: * wtf/text/StringBuffer.h: * wtf/text/StringBuilder.cpp: * wtf/text/StringBuilder.h: * wtf/text/StringBuilderJSON.cpp: * wtf/text/StringCommon.h: * wtf/text/StringConcatenate.h: * wtf/text/StringHash.h: * wtf/text/StringImpl.cpp: * wtf/text/StringImpl.h: * wtf/text/StringOperators.h: * wtf/text/StringView.cpp: * wtf/text/StringView.h: * wtf/text/SymbolImpl.cpp: * wtf/text/SymbolRegistry.cpp: * wtf/text/SymbolRegistry.h: * wtf/text/TextBreakIterator.cpp: * wtf/text/TextBreakIterator.h: * wtf/text/TextBreakIteratorInternalICU.h: * wtf/text/TextPosition.h: * wtf/text/TextStream.cpp: * wtf/text/UniquedStringImpl.h: * wtf/text/WTFString.cpp: * wtf/text/WTFString.h: * wtf/text/cocoa/StringCocoa.mm: * wtf/text/cocoa/StringViewCocoa.mm: * wtf/text/cocoa/TextBreakIteratorInternalICUCocoa.cpp: * wtf/text/icu/UTextProvider.cpp: * wtf/text/icu/UTextProvider.h: * wtf/text/icu/UTextProviderLatin1.cpp: * wtf/text/icu/UTextProviderLatin1.h: * wtf/text/icu/UTextProviderUTF16.cpp: * wtf/text/icu/UTextProviderUTF16.h: * wtf/threads/BinarySemaphore.cpp: * wtf/threads/BinarySemaphore.h: * wtf/threads/Signals.cpp: * wtf/unicode/CharacterNames.h: * wtf/unicode/Collator.h: * wtf/unicode/CollatorDefault.cpp: * wtf/unicode/UTF8.cpp: * wtf/unicode/UTF8.h: Tools: Put WorkQueue in namespace DRT so it does not conflict with WTF::WorkQueue. * DumpRenderTree/TestRunner.cpp: (TestRunner::queueLoadHTMLString): (TestRunner::queueLoadAlternateHTMLString): (TestRunner::queueBackNavigation): (TestRunner::queueForwardNavigation): (TestRunner::queueLoadingScript): (TestRunner::queueNonLoadingScript): (TestRunner::queueReload): * DumpRenderTree/WorkQueue.cpp: (WorkQueue::singleton): Deleted. (WorkQueue::WorkQueue): Deleted. (WorkQueue::queue): Deleted. (WorkQueue::dequeue): Deleted. (WorkQueue::count): Deleted. 
(WorkQueue::clear): Deleted. (WorkQueue::processWork): Deleted. * DumpRenderTree/WorkQueue.h: (WorkQueue::setFrozen): Deleted. * DumpRenderTree/WorkQueueItem.h: * DumpRenderTree/mac/DumpRenderTree.mm: (runTest): * DumpRenderTree/mac/FrameLoadDelegate.mm: (-[FrameLoadDelegate processWork:]): (-[FrameLoadDelegate webView:locationChangeDone:forDataSource:]): * DumpRenderTree/mac/TestRunnerMac.mm: (TestRunner::notifyDone): (TestRunner::forceImmediateCompletion): (TestRunner::queueLoad): * DumpRenderTree/win/DumpRenderTree.cpp: (runTest): * DumpRenderTree/win/FrameLoadDelegate.cpp: (FrameLoadDelegate::processWork): (FrameLoadDelegate::locationChangeDone): * DumpRenderTree/win/TestRunnerWin.cpp: (TestRunner::notifyDone): (TestRunner::forceImmediateCompletion): (TestRunner::queueLoad): Canonical link: https://commits.webkit.org/205473@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@237099 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2018-10-15 14:24:49 +00:00
#pragma once
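The three levels of adaptation described in the "Lightweight locks should be adaptive" commit message (an inlined CAS fast path, a short burst of spinning, then putting the thread to sleep) can be illustrated with a minimal sketch. This is not the WTF::Lock implementation: the class name `AdaptiveLockSketch`, the spin limit of 40, the helper `runCounterDemo`, and the use of a `std::mutex` plus `std::condition_variable` in place of the tagged-pointer thread queue are all assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of a three-level adaptive lock. It is not WTF::Lock:
// a std::mutex/std::condition_variable pair stands in for the thread queue.
class AdaptiveLockSketch {
public:
    void lock()
    {
        // Level 1: fast path, a single CAS on a lock that isn't held.
        // A weak CAS may fail spuriously; that only sends us to the slow path.
        bool expected = false;
        if (m_isHeld.compare_exchange_weak(expected, true))
            return;
        lockSlow();
    }

    void unlock()
    {
        m_isHeld.store(false);
        // Only pay for a wakeup when some thread has gone to sleep.
        if (m_sleeperCount.load() > 0) {
            std::lock_guard<std::mutex> guard(m_sleepMutex);
            m_wakeUp.notify_one();
        }
    }

private:
    bool tryLock()
    {
        // Strong CAS: a spurious failure here could put a thread to sleep
        // while the lock is actually free.
        bool expected = false;
        return m_isHeld.compare_exchange_strong(expected, true);
    }

    void lockSlow()
    {
        // Level 2: a short burst of spinning for microcontention.
        for (int i = 0; i < spinLimit; ++i) {
            if (tryLock())
                return;
            std::this_thread::yield();
        }
        // Level 3: sleep until an unlock() wakes us and we win the lock.
        m_sleeperCount.fetch_add(1);
        {
            std::unique_lock<std::mutex> sleepLock(m_sleepMutex);
            m_wakeUp.wait(sleepLock, [this] { return tryLock(); });
        }
        m_sleeperCount.fetch_sub(1);
    }

    static constexpr int spinLimit = 40; // Arbitrary value chosen for the sketch.
    std::atomic<bool> m_isHeld { false };
    std::atomic<int> m_sleeperCount { 0 };
    std::mutex m_sleepMutex;
    std::condition_variable m_wakeUp;
};

// Hypothetical helper: each thread increments a shared counter under the lock.
int runCounterDemo(int threadCount, int iterations)
{
    assert(threadCount > 0 && iterations >= 0);
    AdaptiveLockSketch lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

Calling `runCounterDemo(4, 10000)` exercises all three paths under contention. The sketch trades the one-word footprint described in the commit message for simplicity, so it says nothing about the memory or speed figures quoted there.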
Make CheckedLock the default Lock https://bugs.webkit.org/show_bug.cgi?id=226157 Reviewed by Darin Adler.
Make CheckedLock the default Lock so that we get more benefits from Clang Thread Safety Analysis. Note that CheckedLock 100% relies on the existing Lock implementation and merely adds the Clang annotations for thread safety.
What this patch does is:
1. Rename the Lock class to UncheckedLock
2. Rename the CheckedLock class to Lock
3. Rename the Condition class to UncheckedCondition
4. Rename the CheckedCondition class to Condition
5. Update the types of certain variables from Lock / Condition to UncheckedLock / UncheckedCondition if I got a build failure.
Build failures are usually caused by the following facts:
- Locker<CheckedLock> doesn't subclass AbstractLocker, which a lot of JSC code passes as an argument
- Locker<CheckedLock> has no move constructor
- Locker<CheckedLock> cannot be constructed from a lock pointer, only a reference
For now, CheckedLock and CheckedCondition remain as aliases to Lock and Condition, in their respective CheckedLock.h / CheckedCondition.h headers. I will drop them in a follow-up to reduce patch size. I will also follow up to try and get rid of as much usage of UncheckedLock and UncheckedCondition as possible. I did not try very hard in this patch to reduce patch size.
Source/JavaScriptCore: * assembler/testmasm.cpp: * dfg/DFGCommon.cpp: * dfg/DFGThreadData.h: * dfg/DFGWorklist.cpp: (JSC::DFG::Worklist::Worklist): * dfg/DFGWorklist.h: * dynbench.cpp: * heap/BlockDirectory.h: (JSC::BlockDirectory::bitvectorLock): * heap/CodeBlockSet.h: (JSC::CodeBlockSet::getLock): * heap/Heap.cpp: (JSC::Heap::Heap): * heap/Heap.h: * heap/MarkedSpace.h: (JSC::MarkedSpace::directoryLock): * heap/MarkingConstraintSolver.h: * heap/SlotVisitor.cpp: (JSC::SlotVisitor::donateKnownParallel): * heap/SlotVisitor.h: * jit/ExecutableAllocator.cpp: (JSC::ExecutableAllocator::getLock const): (JSC::dumpJITMemory): * jit/ExecutableAllocator.h: (JSC::ExecutableAllocatorBase::getLock const): * jit/JITWorklist.cpp: (JSC::JITWorklist::JITWorklist): * jit/JITWorklist.h: * jsc.cpp: * profiler/ProfilerDatabase.h: * runtime/ConcurrentJSLock.h: * runtime/DeferredWorkTimer.h: * runtime/JSLock.h: * runtime/SamplingProfiler.cpp: (JSC::FrameWalker::FrameWalker): (JSC::CFrameWalker::CFrameWalker): (JSC::SamplingProfiler::takeSample): * runtime/SamplingProfiler.h: (JSC::SamplingProfiler::getLock): * runtime/VM.h: * runtime/VMTraps.cpp: (JSC::VMTraps::invalidateCodeBlocksOnStack): (JSC::VMTraps::VMTraps): * runtime/VMTraps.h: * tools/FunctionOverrides.h: * tools/VMInspector.cpp: (JSC::ensureIsSafeToLock): * tools/VMInspector.h: (JSC::VMInspector::getLock): * wasm/WasmCalleeRegistry.h: (JSC::Wasm::CalleeRegistry::getLock): * wasm/WasmPlan.h: * wasm/WasmStreamingCompiler.h: * wasm/WasmThunks.h: * wasm/WasmWorklist.cpp: (JSC::Wasm::Worklist::Worklist): * wasm/WasmWorklist.h:
Source/WebCore: * Modules/indexeddb/server/IDBServer.cpp: * Modules/webaudio/MediaElementAudioSourceNode.h: * Modules/webdatabase/OriginLock.cpp: * bindings/js/JSDOMGlobalObject.h: * dom/Node.cpp: * html/HTMLMediaElement.cpp: (WebCore::HTMLMediaElement::createMediaPlayer): * html/canvas/WebGLContextGroup.cpp: (WebCore::WebGLContextGroup::objectGraphLockForAContext): * html/canvas/WebGLContextGroup.h: * html/canvas/WebGLContextObject.cpp: (WebCore::WebGLContextObject::objectGraphLockForContext): * html/canvas/WebGLContextObject.h: * html/canvas/WebGLObject.h: * html/canvas/WebGLRenderingContextBase.cpp: (WebCore::WebGLRenderingContextBase::objectGraphLock): * html/canvas/WebGLRenderingContextBase.h: * html/canvas/WebGLSharedObject.cpp: (WebCore::WebGLSharedObject::objectGraphLockForContext): * html/canvas/WebGLSharedObject.h: * page/scrolling/mac/ScrollingTreeMac.h: * platform/audio/ReverbConvolver.cpp: (WebCore::ReverbConvolver::backgroundThreadEntry): * platform/graphics/ShadowBlur.cpp: (WebCore::ScratchBuffer::lock): (WebCore::ShadowBlur::drawRectShadowWithTiling): (WebCore::ShadowBlur::drawInsetShadowWithTiling): * platform/graphics/gstreamer/VideoSinkGStreamer.cpp: * platform/graphics/gstreamer/eme/WebKitCommonEncryptionDecryptorGStreamer.cpp:
Source/WebKit: * GPUProcess/graphics/RemoteGraphicsContextGL.cpp: (WebKit::RemoteGraphicsContextGL::paintPixelBufferToImageBuffer): * NetworkProcess/IndexedDB/WebIDBServer.cpp: * UIProcess/API/glib/IconDatabase.h: * UIProcess/mac/WKPrintingView.mm: (-[WKPrintingView knowsPageRange:]):
Source/WTF: * wtf/AutomaticThread.cpp: (WTF::AutomaticThreadCondition::wait): (WTF::AutomaticThreadCondition::waitFor): (WTF::AutomaticThread::AutomaticThread): * wtf/AutomaticThread.h: * wtf/CheckedCondition.h: * wtf/CheckedLock.h: * wtf/Condition.h: * wtf/Lock.cpp: (WTF::UncheckedLock::lockSlow): (WTF::UncheckedLock::unlockSlow): (WTF::UncheckedLock::unlockFairlySlow): (WTF::UncheckedLock::safepointSlow): * wtf/Lock.h: (WTF::WTF_ASSERTS_ACQUIRED_LOCK): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocator::MetaAllocator): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): * wtf/MetaAllocator.h: * wtf/ParallelHelperPool.cpp: (WTF::ParallelHelperPool::ParallelHelperPool): * wtf/ParallelHelperPool.h: * wtf/RecursiveLockAdapter.h: * wtf/WorkerPool.cpp: (WTF::WorkerPool::WorkerPool): * wtf/WorkerPool.h:
Tools: * TestWebKitAPI/Tests/WTF/CheckedConditionTest.cpp: * TestWebKitAPI/Tests/WTF/Condition.cpp: * TestWebKitAPI/Tests/WTF/MetaAllocator.cpp: * WebKitTestRunner/InjectedBundle/AccessibilityController.cpp: (WTR::AXThread::createThreadIfNeeded):
Canonical link: https://commits.webkit.org/238070@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@277943 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2021-05-24 05:37:41 +00:00
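To make the idea of a lock that "merely adds the Clang annotations for thread safety" concrete, here is a minimal sketch of a wrapper annotated for Clang Thread Safety Analysis. The attribute names (`capability`, `acquire_capability`, `release_capability`, `scoped_lockable`, `guarded_by`) come from Clang's documentation. The `SKETCH_*` macros, the classes `CheckedLockSketch`, `LockerSketch`, and `CounterSketch`, and the helper `demoCheckedCounter` are hypothetical and are not the WTF names. The macros expand to nothing on other compilers, and the analysis only reports problems when Clang is run with `-Wthread-safety`.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical macros: Clang thread-safety attributes under Clang, nothing elsewhere.
#if defined(__clang__)
#define SKETCH_CAPABILITY(name) __attribute__((capability(name)))
#define SKETCH_SCOPED_CAPABILITY __attribute__((scoped_lockable))
#define SKETCH_ACQUIRE(...) __attribute__((acquire_capability(__VA_ARGS__)))
#define SKETCH_RELEASE(...) __attribute__((release_capability(__VA_ARGS__)))
#define SKETCH_GUARDED_BY(lock) __attribute__((guarded_by(lock)))
#else
#define SKETCH_CAPABILITY(name)
#define SKETCH_SCOPED_CAPABILITY
#define SKETCH_ACQUIRE(...)
#define SKETCH_RELEASE(...)
#define SKETCH_GUARDED_BY(lock)
#endif

// The annotated lock forwards to an existing lock and adds no behavior of its own.
class SKETCH_CAPABILITY("mutex") CheckedLockSketch {
public:
    void lock() SKETCH_ACQUIRE() { m_impl.lock(); }
    void unlock() SKETCH_RELEASE() { m_impl.unlock(); }

private:
    std::mutex m_impl; // Stand-in for the underlying unchecked lock.
};

// Scoped locker: constructed from a reference only, neither copyable nor movable.
class SKETCH_SCOPED_CAPABILITY LockerSketch {
public:
    explicit LockerSketch(CheckedLockSketch& lock) SKETCH_ACQUIRE(lock)
        : m_lock(lock)
    {
        m_lock.lock();
    }
    ~LockerSketch() SKETCH_RELEASE() { m_lock.unlock(); }

    LockerSketch(const LockerSketch&) = delete;
    LockerSketch& operator=(const LockerSketch&) = delete;

private:
    CheckedLockSketch& m_lock;
};

class CounterSketch {
public:
    void increment()
    {
        LockerSketch locker(m_lock);
        ++m_value;
    }

    int value()
    {
        LockerSketch locker(m_lock);
        return m_value;
    }

private:
    CheckedLockSketch m_lock;
    // With -Wthread-safety, Clang warns when m_value is touched without m_lock held.
    int m_value SKETCH_GUARDED_BY(m_lock) { 0 };
};

// Hypothetical helper showing the intended use.
int demoCheckedCounter()
{
    CounterSketch counter;
    assert(counter.value() == 0);
    counter.increment();
    counter.increment();
    return counter.value();
}
```

The locker in this sketch mirrors two of the restrictions listed in the commit message: it takes a lock reference rather than a pointer, and it has no move constructor.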
#include <mutex>
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen.
Source/JavaScriptCore:
This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again.
It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff:
- The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors.
- The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of.
- All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked).
- JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows:
    - Store-store fence before and after setting the butterfly.
    - Store-store fence before setting structure if you had changed the shape of the butterfly.
    - Store-store fence after initializing all fields in an allocation.
- A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads).
- The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying:
      structure->materializePropertyMap();
      ...
      structure->propertyTable()->things
  Instead of saying:
      PropertyTable* table = structure->ensurePropertyTable();
      ...
      table->things
  Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race.
- The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways.
This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark).
By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs.
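The difference between the two property-table idioms in this commit message can be shown with a toy model. This is only a sketch under loud assumptions: `StructureSketch`, `PropertyTableSketch`, `clearPropertyTable`, `hasPropertyTable`, and `demoEnsureIdiom` are hypothetical names, and a `std::shared_ptr` stands in for the way a local `PropertyTable*` keeps the table alive in JSC. The point it illustrates is that the caller holds on to the value returned by the ensure call instead of re-reading a member that another thread may have cleared in the meantime.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a property table that can be rebuilt on demand.
struct PropertyTableSketch {
    std::vector<std::string> names;
};

class StructureSketch {
public:
    explicit StructureSketch(std::vector<std::string> names)
        : m_names(std::move(names))
    {
    }

    // Materializes the table if needed and hands the caller its own reference,
    // so a later clearPropertyTable() cannot pull it out from under the caller.
    std::shared_ptr<PropertyTableSketch> ensurePropertyTable()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        if (!m_table) {
            auto table = std::make_shared<PropertyTableSketch>();
            table->names = m_names;
            m_table = table;
        }
        return m_table;
    }

    // Stand-in for the collector discarding the table while holding the lock.
    void clearPropertyTable()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_table.reset();
    }

    bool hasPropertyTable()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        return static_cast<bool>(m_table);
    }

private:
    std::mutex m_lock;
    std::vector<std::string> m_names;
    std::shared_ptr<PropertyTableSketch> m_table;
};

// Hypothetical helper: the table obtained from ensurePropertyTable() stays
// usable even if it is cleared between the ensure call and the use.
std::size_t demoEnsureIdiom()
{
    StructureSketch structure(std::vector<std::string> { "x", "y" });
    std::shared_ptr<PropertyTableSketch> table = structure.ensurePropertyTable();
    structure.clearPropertyTable(); // Simulates the GC running in between.
    assert(!structure.hasPropertyTable());
    return table->names.size();
}
```

In the racy shape, the code would re-read the structure's table member after the clear and dereference a null pointer. Holding the returned reference avoids that without needing to defer the collector.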
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
#include <wtf/LockAlgorithm.h>
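The commit message above describes LockAlgorithm as a struct templatized on the lock type, the isHeldBit value and the hasParkedBit value, with static methods that take the atomic word as their first argument so the lock can be projected onto an existing field. A minimal sketch of that shape follows. It is not the WTF implementation: `SketchLockAlgorithm` and `payloadSurvivesLocking` are hypothetical names, `std::atomic` stands in for WTF's `Atomic<>`, and the slow path simply yields instead of parking, so `hasParkedBit` is reserved but never set.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical, simplified stand-in for the LockAlgorithm idea: every method
// is static and operates on an atomic word that the caller owns.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct SketchLockAlgorithm {
    static_assert((isHeldBit & hasParkedBit) == 0, "lock bits must not overlap");

    // Sets isHeldBit if it is clear; all other bits in the word are preserved.
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        for (;;) {
            if (old & isHeldBit)
                return false;
            LockType desired = static_cast<LockType>(old | isHeldBit);
            if (word.compare_exchange_weak(old, desired, std::memory_order_acquire, std::memory_order_relaxed))
                return true;
            // A failed CAS reloads `old`, so the loop re-checks the held bit.
        }
    }

    // Assumption: this sketch yields instead of parking the thread on a queue.
    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    // Clears only isHeldBit, leaving the rest of the word untouched.
    static void unlock(std::atomic<LockType>& word)
    {
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return (word.load(std::memory_order_acquire) & isHeldBit) != 0;
    }
};

// Projects the lock onto the low two bits of a byte whose upper bits carry
// unrelated payload, then reports whether the payload survived.
bool payloadSurvivesLocking()
{
    using ByteLock = SketchLockAlgorithm<uint8_t, 1, 2>;
    std::atomic<uint8_t> header { 0x40 };
    ByteLock::lock(header);
    bool heldAndIntact = ByteLock::isLocked(header) && (header.load() & 0xfc) == 0x40;
    bool secondAttemptFails = !ByteLock::tryLock(header);
    ByteLock::unlock(header);
    return heldAndIntact && secondAttemptFails && header.load() == 0x40;
}
```

Keeping the methods static and passing the word explicitly is what lets the same algorithm serve both a standalone lock byte and two spare bits inside a larger header field.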
Lightweight locks should be adaptive https://bugs.webkit.org/show_bug.cgi?id=147545 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * dfg/DFGCommon.cpp: (JSC::DFG::startCrashing): * heap/CopiedBlock.h: (JSC::CopiedBlock::workListLock): * heap/CopiedBlockInlines.h: (JSC::CopiedBlock::shouldReportLiveBytes): (JSC::CopiedBlock::reportLiveBytes): * heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock): * heap/CopiedSpace.h: (JSC::CopiedSpace::CopiedGeneration::CopiedGeneration): * heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleEvacuatedBlock): * heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::didStartCopying): * heap/GCThreadSharedData.h: (JSC::GCThreadSharedData::getNextBlocksToCopy): * heap/ListableHandler.h: (JSC::ListableHandler::List::addThreadSafe): (JSC::ListableHandler::List::addNotThreadSafe): * heap/MachineStackMarker.cpp: (JSC::MachineThreads::tryCopyOtherThreadStacks): * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::copyLater): * parser/SourceProvider.cpp: (JSC::SourceProvider::~SourceProvider): (JSC::SourceProvider::getID): * profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::addDatabaseToAtExit): (JSC::Profiler::Database::removeDatabaseFromAtExit): (JSC::Profiler::Database::removeFirstAtExitDatabase): * runtime/TypeProfilerLog.h: Source/WebCore: * bindings/objc/WebScriptObject.mm: (WebCore::getJSWrapper): (WebCore::addJSWrapper): (WebCore::removeJSWrapper): (WebCore::removeJSWrapperIfRetainCountOne): * platform/audio/mac/CARingBuffer.cpp: (WebCore::CARingBuffer::setCurrentFrameBounds): (WebCore::CARingBuffer::getCurrentFrameBounds): * platform/audio/mac/CARingBuffer.h: * platform/ios/wak/WAKWindow.mm: (-[WAKWindow setExposedScrollViewRect:]): (-[WAKWindow exposedScrollViewRect]): Source/WebKit2: * WebProcess/WebPage/EventDispatcher.cpp: (WebKit::EventDispatcher::clearQueuedTouchEventsForPage): (WebKit::EventDispatcher::getQueuedTouchEventsForPage): (WebKit::EventDispatcher::touchEvent): 
(WebKit::EventDispatcher::dispatchTouchEvents): * WebProcess/WebPage/EventDispatcher.h: * WebProcess/WebPage/ViewUpdateDispatcher.cpp: (WebKit::ViewUpdateDispatcher::visibleContentRectUpdate): (WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate): * WebProcess/WebPage/ViewUpdateDispatcher.h: Source/WTF: A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big. But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock. The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. 
The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock. This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time. On a locking microbenchmark, this new Lock exhibits the following performance characteristics: - Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex. - Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex. - CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock. - Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock. This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time. This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock. 
But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665. Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head. [1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks: Added. * benchmarks/LockSpeedTest.cpp: Added. (main): * wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong): * wtf/CMakeLists.txt: * wtf/Lock.cpp: Added. (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): * wtf/Lock.h: Added. (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocatorHandle::shrink): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): (WTF::MetaAllocator::addFreshFreeSpace): (WTF::MetaAllocator::debugFreeSpaceSize): * wtf/MetaAllocator.h: * wtf/SpinLock.h: * wtf/ThreadingPthreads.cpp: * wtf/ThreadingWin.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicStringImpl.cpp: (WTF::AtomicStringTableLocker::AtomicStringTableLocker): Tools: * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Lock.cpp: Added. (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/165908@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-07 22:38:59 +00:00
#include <wtf/Locker.h>
#include <wtf/Noncopyable.h>
#include <wtf/Seconds.h>
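The three levels of adaptation described above (an inlined CAS fast path, a bounded burst of spinning, then sleeping until the lock is released) can be sketched as follows. This is an illustration under stated assumptions, not the WTF code: `SketchAdaptiveLock` and `contendedCount` are hypothetical names, the spin limit is arbitrary, and a `std::mutex` plus `std::condition_variable` stand in for the queue of waiting threads that the real lock maintains itself.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

class SketchAdaptiveLock {
public:
    void lock()
    {
        // Fast path: a single weak CAS on an unheld lock.
        uint8_t expected = 0;
        if (m_state.compare_exchange_weak(expected, isHeldBit, std::memory_order_acquire, std::memory_order_relaxed))
            return;
        lockSlow();
    }

    void unlock()
    {
        // Fast path: nobody is parked. A weak CAS may also fail spuriously,
        // so unlockSlow() must cope with there being no sleepers at all.
        uint8_t expected = isHeldBit;
        if (m_state.compare_exchange_weak(expected, 0, std::memory_order_release, std::memory_order_relaxed))
            return;
        unlockSlow();
    }

private:
    static constexpr uint8_t isHeldBit = 1;
    static constexpr uint8_t hasParkedBit = 2;
    static constexpr unsigned spinLimit = 40; // arbitrary value for this sketch

    void lockSlow()
    {
        unsigned spins = 0;
        for (;;) {
            uint8_t state = m_state.load(std::memory_order_relaxed);
            if (!(state & isHeldBit)) {
                uint8_t desired = static_cast<uint8_t>(state | isHeldBit);
                if (m_state.compare_exchange_weak(state, desired, std::memory_order_acquire, std::memory_order_relaxed))
                    return;
                continue;
            }
            if (spins < spinLimit) {
                // Microcontention: give the holder a moment to finish.
                ++spins;
                std::this_thread::yield();
                continue;
            }
            // Long wait: publish hasParkedBit under the queue mutex and sleep.
            // The unlocker needs the same mutex before notifying, so a wakeup
            // cannot slip in between the check and the wait.
            std::unique_lock<std::mutex> guard(m_queueMutex);
            state = m_state.load(std::memory_order_relaxed);
            if (!(state & isHeldBit))
                continue;
            if (!(state & hasParkedBit)) {
                uint8_t desired = static_cast<uint8_t>(state | hasParkedBit);
                if (!m_state.compare_exchange_weak(state, desired, std::memory_order_relaxed, std::memory_order_relaxed))
                    continue;
            }
            m_queue.wait(guard); // spurious wakeups are fine: the loop retries
        }
    }

    void unlockSlow()
    {
        std::lock_guard<std::mutex> guard(m_queueMutex);
        m_state.store(0, std::memory_order_release); // clears both bits
        m_queue.notify_all();
    }

    std::atomic<uint8_t> m_state { 0 };
    std::mutex m_queueMutex;
    std::condition_variable m_queue;
};

// Increments a plain int from several threads under the sketch lock.
int contendedCount(int threadCount, int iterations)
{
    SketchAdaptiveLock lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

Because the sketch borrows a system mutex and condition variable for sleeping, it does not reproduce the space savings the commit message reports; it only shows the control flow of the fast path, the spin phase and the sleep phase.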
Make CheckedLock the default Lock https://bugs.webkit.org/show_bug.cgi?id=226157 Reviewed by Darin Adler. Make CheckedLock the default Lock so that we get more benefits from Clang Thread Safety Analysis. Note that CheckedLock 100% relies on the existing Lock implementation and merely adds the clang annotations for thread safety. What this patch does is: 1. Rename the Lock class to UncheckedLock 2. Rename the CheckedLock class to Lock 3. Rename the Condition class to UncheckedCondition 4. Rename the CheckedCondition class to Condition 5. Update the types of certain variables from Lock / Condition to UncheckedLock / UncheckedCondition if I got a build failure. Build failures are usually caused by the following facts: - Locker<CheckedLock> doesn't subclass AbstractLocker, which a lot of JSC code passes as argument - Locker<CheckedLock> has no move constructor - Locker<CheckedLock> cannot be constructed from a lock pointer, only a reference For now, CheckedLock and CheckedCondition remain as aliases to Lock and Condition, in their respective CheckedLock.h / CheckedCondition.h headers. I will drop them in a follow-up to reduce patch size. I will also follow up to try and get rid of as much usage of UncheckedLock and UncheckedCondition as possible. I did not try very hard in this patch to reduce patch size. Source/JavaScriptCore:
* assembler/testmasm.cpp: * dfg/DFGCommon.cpp: * dfg/DFGThreadData.h: * dfg/DFGWorklist.cpp: (JSC::DFG::Worklist::Worklist): * dfg/DFGWorklist.h: * dynbench.cpp: * heap/BlockDirectory.h: (JSC::BlockDirectory::bitvectorLock): * heap/CodeBlockSet.h: (JSC::CodeBlockSet::getLock): * heap/Heap.cpp: (JSC::Heap::Heap): * heap/Heap.h: * heap/MarkedSpace.h: (JSC::MarkedSpace::directoryLock): * heap/MarkingConstraintSolver.h: * heap/SlotVisitor.cpp: (JSC::SlotVisitor::donateKnownParallel): * heap/SlotVisitor.h: * jit/ExecutableAllocator.cpp: (JSC::ExecutableAllocator::getLock const): (JSC::dumpJITMemory): * jit/ExecutableAllocator.h: (JSC::ExecutableAllocatorBase::getLock const): * jit/JITWorklist.cpp: (JSC::JITWorklist::JITWorklist): * jit/JITWorklist.h: * jsc.cpp: * profiler/ProfilerDatabase.h: * runtime/ConcurrentJSLock.h: * runtime/DeferredWorkTimer.h: * runtime/JSLock.h: * runtime/SamplingProfiler.cpp: (JSC::FrameWalker::FrameWalker): (JSC::CFrameWalker::CFrameWalker): (JSC::SamplingProfiler::takeSample): * runtime/SamplingProfiler.h: (JSC::SamplingProfiler::getLock): * runtime/VM.h: * runtime/VMTraps.cpp: (JSC::VMTraps::invalidateCodeBlocksOnStack): (JSC::VMTraps::VMTraps): * runtime/VMTraps.h: * tools/FunctionOverrides.h: * tools/VMInspector.cpp: (JSC::ensureIsSafeToLock): * tools/VMInspector.h: (JSC::VMInspector::getLock): * wasm/WasmCalleeRegistry.h: (JSC::Wasm::CalleeRegistry::getLock): * wasm/WasmPlan.h: * wasm/WasmStreamingCompiler.h: * wasm/WasmThunks.h: * wasm/WasmWorklist.cpp: (JSC::Wasm::Worklist::Worklist): * wasm/WasmWorklist.h: 
Source/WebCore: * Modules/indexeddb/server/IDBServer.cpp: * Modules/webaudio/MediaElementAudioSourceNode.h: * Modules/webdatabase/OriginLock.cpp: * bindings/js/JSDOMGlobalObject.h: * dom/Node.cpp: * html/HTMLMediaElement.cpp: (WebCore::HTMLMediaElement::createMediaPlayer): * html/canvas/WebGLContextGroup.cpp: (WebCore::WebGLContextGroup::objectGraphLockForAContext): * html/canvas/WebGLContextGroup.h: * html/canvas/WebGLContextObject.cpp: (WebCore::WebGLContextObject::objectGraphLockForContext): * html/canvas/WebGLContextObject.h: * html/canvas/WebGLObject.h: * html/canvas/WebGLRenderingContextBase.cpp: (WebCore::WebGLRenderingContextBase::objectGraphLock): * html/canvas/WebGLRenderingContextBase.h: * html/canvas/WebGLSharedObject.cpp: (WebCore::WebGLSharedObject::objectGraphLockForContext): * html/canvas/WebGLSharedObject.h: * page/scrolling/mac/ScrollingTreeMac.h: * platform/audio/ReverbConvolver.cpp: (WebCore::ReverbConvolver::backgroundThreadEntry): * platform/graphics/ShadowBlur.cpp: (WebCore::ScratchBuffer::lock): (WebCore::ShadowBlur::drawRectShadowWithTiling): (WebCore::ShadowBlur::drawInsetShadowWithTiling): * platform/graphics/gstreamer/VideoSinkGStreamer.cpp: * platform/graphics/gstreamer/eme/WebKitCommonEncryptionDecryptorGStreamer.cpp: 
Source/WebKit: * GPUProcess/graphics/RemoteGraphicsContextGL.cpp: (WebKit::RemoteGraphicsContextGL::paintPixelBufferToImageBuffer): * NetworkProcess/IndexedDB/WebIDBServer.cpp: * UIProcess/API/glib/IconDatabase.h: * UIProcess/mac/WKPrintingView.mm: (-[WKPrintingView knowsPageRange:]): 
Source/WTF: * wtf/AutomaticThread.cpp: (WTF::AutomaticThreadCondition::wait): (WTF::AutomaticThreadCondition::waitFor): (WTF::AutomaticThread::AutomaticThread): * wtf/AutomaticThread.h: * wtf/CheckedCondition.h: * wtf/CheckedLock.h: * wtf/Condition.h: * wtf/Lock.cpp: (WTF::UncheckedLock::lockSlow): (WTF::UncheckedLock::unlockSlow): (WTF::UncheckedLock::unlockFairlySlow): (WTF::UncheckedLock::safepointSlow): * wtf/Lock.h: (WTF::WTF_ASSERTS_ACQUIRED_LOCK): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocator::MetaAllocator): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): * wtf/MetaAllocator.h: * wtf/ParallelHelperPool.cpp: (WTF::ParallelHelperPool::ParallelHelperPool): * wtf/ParallelHelperPool.h: * wtf/RecursiveLockAdapter.h: * wtf/WorkerPool.cpp: (WTF::WorkerPool::WorkerPool): * wtf/WorkerPool.h: 
Tools: * TestWebKitAPI/Tests/WTF/CheckedConditionTest.cpp: * TestWebKitAPI/Tests/WTF/Condition.cpp: * TestWebKitAPI/Tests/WTF/MetaAllocator.cpp: * WebKitTestRunner/InjectedBundle/AccessibilityController.cpp: (WTR::AXThread::createThreadIfNeeded): Canonical link: https://commits.webkit.org/238070@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@277943 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2021-05-24 05:37:41 +00:00
#include <wtf/ThreadSafetyAnalysis.h>
Lightweight locks should be adaptive https://bugs.webkit.org/show_bug.cgi?id=147545 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * dfg/DFGCommon.cpp: (JSC::DFG::startCrashing): * heap/CopiedBlock.h: (JSC::CopiedBlock::workListLock): * heap/CopiedBlockInlines.h: (JSC::CopiedBlock::shouldReportLiveBytes): (JSC::CopiedBlock::reportLiveBytes): * heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock): * heap/CopiedSpace.h: (JSC::CopiedSpace::CopiedGeneration::CopiedGeneration): * heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleEvacuatedBlock): * heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::didStartCopying): * heap/GCThreadSharedData.h: (JSC::GCThreadSharedData::getNextBlocksToCopy): * heap/ListableHandler.h: (JSC::ListableHandler::List::addThreadSafe): (JSC::ListableHandler::List::addNotThreadSafe): * heap/MachineStackMarker.cpp: (JSC::MachineThreads::tryCopyOtherThreadStacks): * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::copyLater): * parser/SourceProvider.cpp: (JSC::SourceProvider::~SourceProvider): (JSC::SourceProvider::getID): * profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::addDatabaseToAtExit): (JSC::Profiler::Database::removeDatabaseFromAtExit): (JSC::Profiler::Database::removeFirstAtExitDatabase): * runtime/TypeProfilerLog.h: Source/WebCore: * bindings/objc/WebScriptObject.mm: (WebCore::getJSWrapper): (WebCore::addJSWrapper): (WebCore::removeJSWrapper): (WebCore::removeJSWrapperIfRetainCountOne): * platform/audio/mac/CARingBuffer.cpp: (WebCore::CARingBuffer::setCurrentFrameBounds): (WebCore::CARingBuffer::getCurrentFrameBounds): * platform/audio/mac/CARingBuffer.h: * platform/ios/wak/WAKWindow.mm: (-[WAKWindow setExposedScrollViewRect:]): (-[WAKWindow exposedScrollViewRect]): Source/WebKit2: * WebProcess/WebPage/EventDispatcher.cpp: (WebKit::EventDispatcher::clearQueuedTouchEventsForPage): (WebKit::EventDispatcher::getQueuedTouchEventsForPage): (WebKit::EventDispatcher::touchEvent): 
(WebKit::EventDispatcher::dispatchTouchEvents): * WebProcess/WebPage/EventDispatcher.h: * WebProcess/WebPage/ViewUpdateDispatcher.cpp: (WebKit::ViewUpdateDispatcher::visibleContentRectUpdate): (WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate): * WebProcess/WebPage/ViewUpdateDispatcher.h: Source/WTF: A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big. But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock. The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. 
The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock. This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time. On a locking microbenchmark, this new Lock exhibits the following performance characteristics: - Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex. - Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex. - CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock. - Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock. This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time. This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock. 
But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665. Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head. [1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks: Added. * benchmarks/LockSpeedTest.cpp: Added. (main): * wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong): * wtf/CMakeLists.txt: * wtf/Lock.cpp: Added. (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): * wtf/Lock.h: Added. (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocatorHandle::shrink): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): (WTF::MetaAllocator::addFreshFreeSpace): (WTF::MetaAllocator::debugFreeSpaceSize): * wtf/MetaAllocator.h: * wtf/SpinLock.h: * wtf/ThreadingPthreads.cpp: * wtf/ThreadingWin.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicStringImpl.cpp: (WTF::AtomicStringTableLocker::AtomicStringTableLocker): Tools: * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Lock.cpp: Added. (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/165908@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-07 22:38:59 +00:00
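The three levels of adaptation described in the commit message above (inlined CAS fast path, a short burst of spinning, then sleeping) can be sketched roughly as follows. This is a minimal illustration and not the WTF implementation: the class name `AdaptiveLockSketch`, the spin limit of 40, the `isHeldBit` name, and the use of a `std::mutex` plus `std::condition_variable` in place of the queue of thread nodes are all assumptions made for the example. `hasParkedBit` is borrowed from the later "thundering herd" commit message.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of an adaptive lock; not the real WTF::Lock.
class AdaptiveLockSketch {
public:
    void lock()
    {
        // Fast path: one CAS acquires a lock that is not held.
        uint8_t expected = 0;
        if (m_state.compare_exchange_weak(expected, isHeldBit, std::memory_order_acquire))
            return;
        lockSlow();
    }

    void unlock()
    {
        // Fast path: one CAS releases a lock with no sleepers. A weak CAS may
        // fail spuriously, so unlockSlow() must cope with having no sleepers.
        uint8_t expected = isHeldBit;
        if (m_state.compare_exchange_weak(expected, 0, std::memory_order_release))
            return;
        unlockSlow();
    }

private:
    static constexpr uint8_t isHeldBit = 1;
    static constexpr uint8_t hasParkedBit = 2;
    static constexpr unsigned spinLimit = 40; // assumed value

    void lockSlow()
    {
        unsigned spinCount = 0;
        for (;;) {
            uint8_t current = m_state.load();
            if (!(current & isHeldBit)) {
                if (m_state.compare_exchange_weak(current, current | isHeldBit, std::memory_order_acquire))
                    return;
                continue;
            }
            // Microcontention: yield a bounded number of times before sleeping.
            if (!(current & hasParkedBit) && spinCount < spinLimit) {
                ++spinCount;
                std::this_thread::yield();
                continue;
            }
            // Long wait: announce a sleeper, then sleep. The bit is set while
            // holding m_sleepMutex, so unlockSlow() cannot miss this waiter.
            std::unique_lock<std::mutex> guard(m_sleepMutex);
            current = m_state.load();
            if (!(current & isHeldBit))
                continue;
            if (!(current & hasParkedBit)
                && !m_state.compare_exchange_weak(current, current | hasParkedBit))
                continue;
            m_sleepCondition.wait(guard); // spurious wakeups just loop again
        }
    }

    void unlockSlow()
    {
        std::lock_guard<std::mutex> guard(m_sleepMutex);
        m_state.store(0, std::memory_order_release);
        m_sleepCondition.notify_all(); // wakes everyone, like the original unparkAll()
    }

    std::atomic<uint8_t> m_state { 0 };
    std::mutex m_sleepMutex;
    std::condition_variable m_sleepCondition;
};

// Small demo: several threads increment a shared counter under the lock.
long runCounterDemo(unsigned threadCount, unsigned iterations)
{
    AdaptiveLockSketch lock;
    long counter = 0;
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < threadCount; ++i) {
        threads.emplace_back([&] {
            for (unsigned j = 0; j < iterations; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

Unlike this sketch, the lock described in the message is a single tagged pointer of size sizeof(void*) with no separate mutex or condition variable, so the sketch illustrates only the control flow, not the memory footprint.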
WTF::Lock should not suffer from the thundering herd https://bugs.webkit.org/show_bug.cgi?id=147947 Reviewed by Geoffrey Garen. Source/WTF: This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with doing this is that it's not obvious after calling unparkOne() if there are any other threads that are still parked on the lock's queue. If we assume that there are and leave the hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that if there aren't actually any threads parked. On the other hand, if we assume that there aren't any threads parked and clear the hasParkedBit, then if there actually were some threads parked, then they may never be awoken since future calls to unlock() won't take the slow path and so won't call unparkOne(). In other words, we need a way to be very precise about when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case that we clear the bit just as some thread gets parked on the queue. A similar problem arises in futexes, and one of the solutions is to have a thread that acquires a lock after parking set the hasParkedBit. This is what Rusty Russell's usersem does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked thread runs, then that barging thread will not know that there are threads parked. This could increase the severity of barging. Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security issues and so we can expose callbacks while ParkingLot is holding its internal locks. This change does exactly that for unparkOne(). The new variant of unparkOne() will call a user function while the queue from which we are unparking is locked. The callback is told basic stats about the queue: did we unpark a thread this time, and could there be more threads to unpark in the future. 
The callback runs while it's impossible for the queue state to change, since the ParkingLot's internal locks for the queue are held. This means that Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock inside the callback from unparkOne(). This takes care of the thundering herd problem while also reducing the greed that arises from barging threads. This required some careful reworking of the ParkingLot algorithm. The first thing I noticed was that the ThreadData::shouldPark flag was useless, since it's set exactly when ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create both hashtables and buckets, since the "callback is called while queue is locked" invariant requires that we didn't exit early due to the hashtable or bucket not being present. Note that all of this is done in such a way that the old unparkOne() and unparkAll() don't have to create any buckets, though they now may create the hashtable. We don't care as much about the hashtable being created by unpark since it's just such an unlikely scenario and it would only happen once. This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that benchmark. * benchmarks/LockSpeedTest.cpp: * wtf/Lock.cpp: (WTF::LockBase::unlockSlow): * wtf/Lock.h: (WTF::LockBase::isLocked): (WTF::LockBase::isFullyReset): * wtf/ParkingLot.cpp: (WTF::ParkingLot::parkConditionally): (WTF::ParkingLot::unparkOne): (WTF::ParkingLot::unparkAll): * wtf/ParkingLot.h: * wtf/WordLock.h: (WTF::WordLock::isLocked): (WTF::WordLock::isFullyReset): Tools: Add testing that checks that locks return to a pristine state after contention is over. 
* TestWebKitAPI/Tests/WTF/Lock.cpp: (TestWebKitAPI::LockInspector::isFullyReset): (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/166072@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
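The callback idea in the commit message above can be illustrated with a toy model: the queue invokes the callback while its own lock is held, so the unlocking thread can decide whether to clear or keep the hasParkedBit without racing against a thread that is parking. Everything here is a simplified assumption rather than the ParkingLot API: `ToyParkingQueue`, the `UnparkResult` field names, `isHeldBit`, and `unlockSlowSketch` are hypothetical, and the toy queue only tracks waiter IDs instead of waking real threads.

```cpp
#include <atomic>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>

// What the callback is told about the queue (field names are assumptions).
struct UnparkResult {
    bool didUnparkThread { false };
    bool mayHaveMoreThreads { false };
};

// Hypothetical stand-in for one ParkingLot queue. It records waiter IDs only.
class ToyParkingQueue {
public:
    void enqueue(int waiterID)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_waiters.push_back(waiterID);
    }

    // Dequeues at most one waiter, then runs the callback while the queue lock
    // is still held, so the queue cannot change underneath the callback.
    void unparkOne(const std::function<void(UnparkResult)>& callback)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        UnparkResult result;
        if (!m_waiters.empty()) {
            m_waiters.pop_front();
            result.didUnparkThread = true;
        }
        result.mayHaveMoreThreads = !m_waiters.empty();
        callback(result);
    }

private:
    std::mutex m_lock;
    std::deque<int> m_waiters;
};

constexpr uint8_t isHeldBit = 1;
constexpr uint8_t hasParkedBit = 2;

// Releases the lock inside the callback, keeping hasParkedBit only when the
// queue reports that more threads may still be parked.
void unlockSlowSketch(std::atomic<uint8_t>& lockByte, ToyParkingQueue& queue)
{
    queue.unparkOne([&](UnparkResult result) {
        lockByte.store(result.mayHaveMoreThreads ? hasParkedBit : 0, std::memory_order_release);
    });
}
```

With two waiters queued, the first call leaves the hasParkedBit set so that the next unlock() still takes the slow path. The second call clears it, returning the lock byte to its pristine state.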
namespace TestWebKitAPI {
struct LockInspector;
}
namespace WTF {
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen. Source/JavaScriptCore: This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again. It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff: - The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors. - The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of. - All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. 
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
typedef LockAlgorithm<uint8_t, 1, 2> DefaultLockAlgorithm;
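The template parameters in the typedef above (lock type, isHeldBit = 1, hasParkedBit = 2) can be illustrated with a minimal sketch. This is not the WTF implementation: `SketchLockAlgorithm`, `SketchDefaultLockAlgorithm`, and `runSketchLockDemo` are hypothetical names, `std::atomic` stands in for WTF's `Atomic<>`, and the sketch spins and yields instead of parking, so the hasParkedBit value is reserved but never set. It only shows how static methods taking the atomic word as their first argument let a lock be projected onto a byte whose other bits carry unrelated data.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical, simplified stand-in for a LockAlgorithm-style struct.
// Assumption: no parking; contended lockers just yield and retry.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct SketchLockAlgorithm {
    // Tries to set isHeldBit with a CAS; leaves every other bit untouched.
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType oldValue = word.load(std::memory_order_relaxed);
        for (;;) {
            if (oldValue & isHeldBit)
                return false;
            if (word.compare_exchange_weak(oldValue, static_cast<LockType>(oldValue | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    // Clears only isHeldBit, so unrelated bits in the same word survive.
    static void unlock(std::atomic<LockType>& word)
    {
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};

using SketchDefaultLockAlgorithm = SketchLockAlgorithm<uint8_t, 1, 2>;

// Four threads increment a plain counter while holding the lock that lives in
// the low bit of headerByte. Returns the final counter value.
inline unsigned runSketchLockDemo(std::atomic<uint8_t>& headerByte)
{
    unsigned counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < 10000; ++i) {
                SketchDefaultLockAlgorithm::lock(headerByte);
                ++counter;
                SketchDefaultLockAlgorithm::unlock(headerByte);
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

Calling `runSketchLockDemo` on a byte initialized to `0x40` should leave the `0x40` payload bit intact once all threads finish, which is the property that makes it plausible to keep such a lock in spare header bits.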
Always use a byte-sized lock implementation https://bugs.webkit.org/show_bug.cgi?id=147908 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * runtime/ConcurrentJITLock.h: Lock is now byte-sized and ByteLock is gone, so use Lock. Source/WTF: At the start of my locking algorithm crusade, I implemented Lock, which is a sizeof(void*) lock implementation with some nice theoretical properties and good performance. Then I added the ParkingLot abstraction and ByteLock. ParkingLot uses Lock in its implementation. ByteLock uses ParkingLot to create a sizeof(char) lock implementation that performs like Lock. It turns out that ByteLock is always at least as good as Lock, and sometimes a lot better: it requires 8x less memory on 64-bit systems. It's hard to construct a benchmark where ByteLock is significantly slower than Lock, and when you do construct such a benchmark, tweaking it a bit can also create a scenario where ByteLock is significantly faster than Lock. So, the thing that we call "Lock" should really use ByteLock's algorithm, since it is more compact and just as fast. That's what this patch does. But we still need to keep the old Lock algorithm, because it's used to implement ParkingLot, which in turn is used to implement ByteLock. So this patch does this transformation: - Move the algorithm in Lock into files called WordLock.h|cpp. Make ParkingLot use WordLock. - Move the algorithm in ByteLock into Lock.h|cpp. Make everyone who used ByteLock use Lock instead. All other users of Lock now get the byte-sized lock implementation. - Remove the old ByteLock files. * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks/LockSpeedTest.cpp: (main): * wtf/WordLock.cpp: Added. (WTF::WordLock::lockSlow): (WTF::WordLock::unlockSlow): * wtf/WordLock.h: Added. (WTF::WordLock::WordLock): (WTF::WordLock::lock): (WTF::WordLock::unlock): (WTF::WordLock::isHeld): (WTF::WordLock::isLocked): * wtf/ByteLock.cpp: Removed. * wtf/ByteLock.h: Removed. 
* wtf/CMakeLists.txt: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock): * wtf/ParkingLot.cpp: Tools: All previous tests of Lock are now tests of WordLock. All previous tests of ByteLock are now tests of Lock. * TestWebKitAPI/Tests/WTF/Lock.cpp: (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/166025@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188323 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-12 04:20:24 +00:00
// This is a fully adaptive mutex that only requires 1 byte of storage. It has fast paths that are
// competitive with a spinlock (uncontended locking is inlined and is just a CAS, microcontention is
// handled by spinning and yielding), and a slow path that is competitive with std::mutex (if a lock
WTF::Lock should be fair eventually https://bugs.webkit.org/show_bug.cgi?id=159384 Reviewed by Geoffrey Garen. Source/WTF: In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of locks makes them fast. That post presented lock fairness as a trade-off between two extremes: - Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a thread on the queue. If there was a thread on the queue, the lock is released and that thread is made runnable. That thread may then grab the lock, or some other thread may grab the lock first (it may barge). Usually, the barging thread is the thread that released the lock in the first place. This maximizes throughput but hurts fairness. There is no good theoretical bound on how unfair the lock may become, but empirical data suggests that it's fair enough for the cases we previously measured. - FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock() if there is a thread waiting. If there is a thread waiting, unlock() will make that thread runnable and inform it that it now holds the lock. This ensures perfect round-robin fairness and allows us to reason theoretically about how long it may take for a thread to grab the lock. For example, if we know that only N threads are running and each one may contend on a critical section, and each one may hold the lock for at most S seconds, then the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly in most cases. This is because for the common case of short critical sections, they force a context switch after each critical section if the lock is contended. This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging. Thanks to this new algorithm, you can now have both of these things at the same time. This change makes WTF::Lock eventually fair. 
We can almost (more on the caveats below) guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical sections that are longer than 1ms are always fair. For shorter critical sections, the amount of time that any thread waits is 1ms times the number of threads. There are some caveats that arise from our use of randomness, but even then, in the limit as the critical section length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our experiments show that the lock becomes exactly as fair as a FIFO lock for any critical section that is 1ms or longer. The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair. ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each other via a token. unparkOne() can pass a token, which parkConditionally() will return. This change also makes parkConditionally() a lot more precise about when it was unparked due to a call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is guaranteed to report that it was unparked rather than timing out, and that thread is guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing fair unlocking. In that case, unlock() will not release the lock, and lock() will know that it holds the lock as soon as parkConditionally() returns. 
Note that this algorithm relies on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock can make a decision about unlock strategy and inject a token while it has complete knowledge of the state of the queue. As such, it's not immediately obvious how to implement this algorithm on top of futexes. You really need ParkingLot! WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(), which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms at random. When a dequeue happens and there are threads that actually get dequeued, we check if the time since the last fair unlock (the last time timeToBeFair was set to true) is more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout. This means that in the absence of ParkingLot collisions, fair unlocking is guaranteed to happen at least once per millisecond. It will happen at 2 kHz on average. If there are collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid resonance. Imagine a program in which some thread acquires a lock at 1 kHz in-phase with the timeToBeFair timeout. Then this thread would be the beneficiary of fairness to the detriment of everyone else. Randomness ensures that we aren't too fair to any one thread. Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous improvement in LockFairnessTest. 
It's common for an unfair lock (either our BargingLock, the old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to allow only one thread to hold the lock during a whole second in which each thread is holding the lock for 1ms at a time. This is because in a barging lock, releasing a lock after holding it for 1ms and then reacquiring it immediately virtually ensures that none of the other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock handles this case like a champ: each thread gets equal turns. Here's some data. If we launch 10 threads and have each of them run for 1 second while repeatedly holding a critical section for 1ms, then here's how many times each thread gets to hold the lock using the old WTF::Lock algorithm: 799, 6, 1, 1, 1, 1, 1, 1, 1, 1 One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock becomes totally fair: 80, 79, 79, 79, 79, 79, 79, 80, 80, 79 I don't know of anyone creating such an automatically-fair adaptive lock before, so I think that this is a pretty awesome advancement to the state of the art! This change is good for three reasons: - We do have long critical sections in WebKit and we don't want to have to worry about starvation. This reduces the likelihood that we will see starvation due to our lock strategy. - I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or lockFairly() or some moral equivalent for the scavenger thread. - If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability to unlock and relock without barging. 
* benchmarks/LockFairnessTest.cpp: (main): * benchmarks/ToyLocks.h: * wtf/Condition.h: (WTF::ConditionBase::waitUntil): (WTF::ConditionBase::notifyOne): * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): * wtf/Lock.h: (WTF::LockBase::try_lock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (WTF::LockBase::isFullyReset): * wtf/ParkingLot.cpp: (WTF::ParkingLot::parkConditionallyImpl): (WTF::ParkingLot::unparkOne): (WTF::ParkingLot::unparkOneImpl): (WTF::ParkingLot::unparkAll): * wtf/ParkingLot.h: (WTF::ParkingLot::parkConditionally): (WTF::ParkingLot::compareAndPark): (WTF::ParkingLot::unparkOne): Tools: * TestWebKitAPI/Tests/WTF/ParkingLot.cpp: Canonical link: https://commits.webkit.org/178039@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
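The randomized fairness timeout described in the commit message above can be modeled with a small sketch. It is an illustration under stated assumptions, not ParkingLot's code: `SketchFairnessTimer` is a hypothetical class, the timeout is drawn uniformly from 0 to 1000 microseconds with `std::mt19937`, and the caller passes the current time explicitly so the behavior is easy to reason about.

```cpp
#include <chrono>
#include <random>

// Hypothetical model of a per-bucket stochastic fairness timeout.
class SketchFairnessTimer {
public:
    using Clock = std::chrono::steady_clock;

    explicit SketchFairnessTimer(Clock::time_point now, unsigned seed = 0)
        : m_random(seed)
    {
        reset(now);
    }

    // Intended to be consulted when a dequeue actually unparks a thread.
    // Returns true when the unlock should be a direct hand-off (fair unlock),
    // and in that case draws a fresh random timeout.
    bool timeToBeFair(Clock::time_point now)
    {
        if (now <= m_nextFairTime)
            return false;
        reset(now);
        return true;
    }

private:
    // Picks the next deadline somewhere in [now, now + 1ms] at random, so a
    // thread that locks at a fixed rate cannot stay in phase with the timer.
    void reset(Clock::time_point now)
    {
        std::uniform_int_distribution<int> micros(0, 1000);
        m_nextFairTime = now + std::chrono::microseconds(micros(m_random));
    }

    std::mt19937 m_random;
    Clock::time_point m_nextFairTime;
};
```

Because each deadline is at most 1ms after the previous fair unlock, any dequeue that happens more than 1ms later is reported as time to be fair, which matches the "at least once per millisecond" bound for a bucket without collisions.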
// cannot be acquired in a short period of time, the thread is put to sleep until the lock is
// available again). It uses less memory than a std::mutex. This lock guarantees eventual stochastic
// fairness, even in programs that relock the lock immediately after unlocking it. Except when there
// are collisions between this lock and other locks in the ParkingLot, this lock will guarantee that
// at worst one call to unlock() per millisecond will do a direct hand-off to the thread that is at
// the head of the queue. When there are collisions, each collision increases the fair unlock delay
// by one millisecond in the worst case.
//
// This lock type supports thread safety analysis.
// To annotate a member variable or a global variable with thread ownership information,
// use lock capability annotations defined in ThreadSafetyAnalysis.h.
class WTF_CAPABILITY_LOCK Lock {
WTF_MAKE_NONCOPYABLE(Lock);
[WTF] Remove XXXLockBase since constexpr constructor can initialize static variables without calling global constructors https://bugs.webkit.org/show_bug.cgi?id=180495 Reviewed by Mark Lam. A very nice feature of C++11 is that a constexpr constructor can initialize static global variables without calling global constructors. We do not need to have XXXLockBase with a derived XXXLock class, since StaticXXXLock can have constructors as long as they are constexpr. We remove a bunch of these classes and set `XXXLock() = default;` explicitly for readability. A defaulted default constructor in C++11 is constexpr as long as each member's default constructor / default initializer is constexpr. * wtf/Condition.h: (WTF::ConditionBase::construct): Deleted. (WTF::ConditionBase::waitUntil): Deleted. (WTF::ConditionBase::waitFor): Deleted. (WTF::ConditionBase::wait): Deleted. (WTF::ConditionBase::notifyOne): Deleted. (WTF::ConditionBase::notifyAll): Deleted. (WTF::Condition::Condition): Deleted. * wtf/CountingLock.h: (WTF::CountingLock::CountingLock): Deleted. (WTF::CountingLock::~CountingLock): Deleted. * wtf/Lock.cpp: (WTF::Lock::lockSlow): (WTF::Lock::unlockSlow): (WTF::Lock::unlockFairlySlow): (WTF::Lock::safepointSlow): (WTF::LockBase::lockSlow): Deleted. (WTF::LockBase::unlockSlow): Deleted. (WTF::LockBase::unlockFairlySlow): Deleted. (WTF::LockBase::safepointSlow): Deleted. * wtf/Lock.h: (WTF::LockBase::construct): Deleted. (WTF::LockBase::lock): Deleted. (WTF::LockBase::tryLock): Deleted. (WTF::LockBase::try_lock): Deleted. (WTF::LockBase::unlock): Deleted. (WTF::LockBase::unlockFairly): Deleted. (WTF::LockBase::safepoint): Deleted. (WTF::LockBase::isHeld const): Deleted. (WTF::LockBase::isLocked const): Deleted. (WTF::LockBase::isFullyReset const): Deleted. (WTF::Lock::Lock): Deleted. * wtf/ReadWriteLock.cpp: (WTF::ReadWriteLock::readLock): (WTF::ReadWriteLock::readUnlock): (WTF::ReadWriteLock::writeLock): (WTF::ReadWriteLock::writeUnlock): (WTF::ReadWriteLockBase::construct): Deleted. 
(WTF::ReadWriteLockBase::readLock): Deleted. (WTF::ReadWriteLockBase::readUnlock): Deleted. (WTF::ReadWriteLockBase::writeLock): Deleted. (WTF::ReadWriteLockBase::writeUnlock): Deleted. * wtf/ReadWriteLock.h: (WTF::ReadWriteLock::read): (WTF::ReadWriteLock::write): (WTF::ReadWriteLockBase::ReadLock::tryLock): Deleted. (WTF::ReadWriteLockBase::ReadLock::lock): Deleted. (WTF::ReadWriteLockBase::ReadLock::unlock): Deleted. (WTF::ReadWriteLockBase::WriteLock::tryLock): Deleted. (WTF::ReadWriteLockBase::WriteLock::lock): Deleted. (WTF::ReadWriteLockBase::WriteLock::unlock): Deleted. (WTF::ReadWriteLockBase::read): Deleted. (WTF::ReadWriteLockBase::write): Deleted. (WTF::ReadWriteLock::ReadWriteLock): Deleted. * wtf/RecursiveLockAdapter.h: (WTF::RecursiveLockAdapter::RecursiveLockAdapter): Deleted. * wtf/WordLock.cpp: (WTF::WordLock::lockSlow): (WTF::WordLock::unlockSlow): (WTF::WordLockBase::lockSlow): Deleted. (WTF::WordLockBase::unlockSlow): Deleted. * wtf/WordLock.h: (WTF::WordLockBase::lock): Deleted. (WTF::WordLockBase::unlock): Deleted. (WTF::WordLockBase::isHeld const): Deleted. (WTF::WordLockBase::isLocked const): Deleted. (WTF::WordLockBase::isFullyReset const): Deleted. (WTF::WordLock::WordLock): Deleted. * wtf/WorkQueue.cpp: Canonical link: https://commits.webkit.org/196438@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225617 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-07 03:52:09 +00:00
WTF_MAKE_FAST_ALLOCATED;
public:
constexpr Lock() = default;
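The point of the defaulted constexpr constructor above can be shown with a small sketch. `SketchLock`, `sketchCompileTimeLock`, and `sketchGlobalLock` are hypothetical names, and `std::atomic<uint8_t>` is assumed as the storage; the real class layout may differ. The `constexpr` variable is only there as a compile-time check that the constructor is usable in a constant expression, which is what allows a static instance to be constant-initialized instead of needing a global constructor at startup.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical one-byte lock shell; only construction is of interest here.
class SketchLock {
public:
    // Defaulted and constexpr: valid because the only member has a constant
    // default member initializer.
    constexpr SketchLock() = default;

    bool isHeld() const { return m_byte.load(std::memory_order_acquire) & 1; }

private:
    std::atomic<uint8_t> m_byte { 0 };
};

// Compiles only if SketchLock can be constructed in a constant expression.
constexpr SketchLock sketchCompileTimeLock { };

// A static instance like this is eligible for constant initialization, so no
// separate "LockBase" type with a construct() method is needed for globals.
static SketchLock sketchGlobalLock;
```

With a C++20 compiler, declaring the global as `constinit` would turn any accidental loss of constant initialization into a compile error; that keyword is not assumed by the sketch.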
WTF should have a ParkingLot for parking sleeping threads, so that locks can fit in 1.6 bits https://bugs.webkit.org/show_bug.cgi?id=147665 Reviewed by Mark Lam. Source/JavaScriptCore: Replace ByteSpinLock with ByteLock. * runtime/ConcurrentJITLock.h: Source/WTF: This change adds a major new abstraction for concurrency algorithms in WebKit. It's called a ParkingLot, and it makes available a thread parking queue for each virtual address in memory. The queues are maintained by a data-access-parallel concurrent hashtable implementation. The memory usage is bounded at around half a KB per thread. The ParkingLot makes it easy to turn any spinlock-based concurrency protocol into one that parks threads after a while. Because queue state management is up to the ParkingLot and not the user's data structure, this patch uses it to implement a full adaptive mutex in one byte. In fact, only three states of that byte are used (0 = available, 1 = locked, 2 = locked and there are parked threads). Hence the joke that ParkingLot allows locks that fit in 1.6 bits. ByteLock is used as a replacement for ByteSpinLock in JavaScriptCore. The API tests for this also demo how to create a completely fair (FIFO) binary semaphore. The comment in Lock.h shows how we could accelerate Lock performance using ParkingLot. After we are sure that this code works, we can expand the use of ParkingLot. That's covered by https://bugs.webkit.org/show_bug.cgi?id=147841. * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks/LockSpeedTest.cpp: (main): * wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong): * wtf/ByteLock.cpp: Added. (WTF::ByteLock::lockSlow): (WTF::ByteLock::unlockSlow): * wtf/ByteLock.h: Added. (WTF::ByteLock::ByteLock): (WTF::ByteLock::lock): (WTF::ByteLock::unlock): (WTF::ByteLock::isHeld): (WTF::ByteLock::isLocked): * wtf/CMakeLists.txt: * wtf/Lock.h: * wtf/ParkingLot.cpp: Added. 
(WTF::ParkingLot::parkConditionally): (WTF::ParkingLot::unparkOne): (WTF::ParkingLot::unparkAll): (WTF::ParkingLot::forEach): * wtf/ParkingLot.h: Added. (WTF::ParkingLot::compareAndPark): Tools: * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Lock.cpp: (TestWebKitAPI::TEST): * TestWebKitAPI/Tests/WTF/ParkingLot.cpp: Added. (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/165996@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188280 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-11 19:51:35 +00:00
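The three byte states named in the commit message above (0 = available, 1 = locked, 2 = locked with parked threads) can be illustrated with a small standalone sketch. This is not the WTF::ByteLock or WTF::ParkingLot code: `ToyParkingLot`, `ToyByteLock`, `runByteLockDemo`, and the spin count of 40 are hypothetical names and values chosen for illustration, and the toy lot is a single global mutex and condition variable rather than a per-address queue in a concurrent hashtable.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a parking lot: one global queue, not one per address.
class ToyParkingLot {
public:
    // Sleeps while `byte` still equals `expected`. The check runs under the
    // mutex, so a wake-up issued after the byte changed cannot be missed.
    void parkWhileEqual(const std::atomic<uint8_t>& byte, uint8_t expected)
    {
        std::unique_lock<std::mutex> guard(m_mutex);
        m_condition.wait(guard, [&] { return byte.load() != expected; });
    }

    void unparkAll()
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_condition.notify_all();
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_condition;
};

static ToyParkingLot& toyParkingLot()
{
    static ToyParkingLot lot;
    return lot;
}

class ToyByteLock {
public:
    void lock()
    {
        // Fast path: one CAS from "available" to "locked".
        uint8_t expected = Available;
        if (m_byte.compare_exchange_strong(expected, Locked))
            return;
        lockSlow();
    }

    void unlock()
    {
        // Only pay for a wake-up if some thread advertised that it parked.
        if (m_byte.exchange(Available) == LockedWithParked)
            toyParkingLot().unparkAll();
    }

private:
    enum : uint8_t { Available = 0, Locked = 1, LockedWithParked = 2 };

    void lockSlow()
    {
        // Spin briefly for short critical sections (40 is an arbitrary choice).
        for (int i = 0; i < 40; ++i) {
            uint8_t expected = Available;
            if (m_byte.compare_exchange_weak(expected, Locked))
                return;
            std::this_thread::yield();
        }
        // Advertise a parked thread; if the old value was Available, we own the lock.
        while (m_byte.exchange(LockedWithParked) != Available)
            toyParkingLot().parkWhileEqual(m_byte, LockedWithParked);
    }

    std::atomic<uint8_t> m_byte { Available };
};

// Four threads increment a shared counter under the lock.
int runByteLockDemo()
{
    const int threadCount = 4;
    const int iterations = 10000;
    ToyByteLock lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    assert(counter == threadCount * iterations);
    return counter;
}
```

A thread that acquires the lock through the parking path leaves the byte at 2 even when nobody else is waiting, so its unlock may issue one unnecessary wake-up. The sketch accepts that cost to keep all lock state in the single byte.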
void lock() WTF_ACQUIRES_LOCK()
Lightweight locks should be adaptive https://bugs.webkit.org/show_bug.cgi?id=147545 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * dfg/DFGCommon.cpp: (JSC::DFG::startCrashing): * heap/CopiedBlock.h: (JSC::CopiedBlock::workListLock): * heap/CopiedBlockInlines.h: (JSC::CopiedBlock::shouldReportLiveBytes): (JSC::CopiedBlock::reportLiveBytes): * heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock): * heap/CopiedSpace.h: (JSC::CopiedSpace::CopiedGeneration::CopiedGeneration): * heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleEvacuatedBlock): * heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::didStartCopying): * heap/GCThreadSharedData.h: (JSC::GCThreadSharedData::getNextBlocksToCopy): * heap/ListableHandler.h: (JSC::ListableHandler::List::addThreadSafe): (JSC::ListableHandler::List::addNotThreadSafe): * heap/MachineStackMarker.cpp: (JSC::MachineThreads::tryCopyOtherThreadStacks): * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::copyLater): * parser/SourceProvider.cpp: (JSC::SourceProvider::~SourceProvider): (JSC::SourceProvider::getID): * profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::addDatabaseToAtExit): (JSC::Profiler::Database::removeDatabaseFromAtExit): (JSC::Profiler::Database::removeFirstAtExitDatabase): * runtime/TypeProfilerLog.h: Source/WebCore: * bindings/objc/WebScriptObject.mm: (WebCore::getJSWrapper): (WebCore::addJSWrapper): (WebCore::removeJSWrapper): (WebCore::removeJSWrapperIfRetainCountOne): * platform/audio/mac/CARingBuffer.cpp: (WebCore::CARingBuffer::setCurrentFrameBounds): (WebCore::CARingBuffer::getCurrentFrameBounds): * platform/audio/mac/CARingBuffer.h: * platform/ios/wak/WAKWindow.mm: (-[WAKWindow setExposedScrollViewRect:]): (-[WAKWindow exposedScrollViewRect]): Source/WebKit2: * WebProcess/WebPage/EventDispatcher.cpp: (WebKit::EventDispatcher::clearQueuedTouchEventsForPage): (WebKit::EventDispatcher::getQueuedTouchEventsForPage): (WebKit::EventDispatcher::touchEvent): 
(WebKit::EventDispatcher::dispatchTouchEvents): * WebProcess/WebPage/EventDispatcher.h: * WebProcess/WebPage/ViewUpdateDispatcher.cpp: (WebKit::ViewUpdateDispatcher::visibleContentRectUpdate): (WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate): * WebProcess/WebPage/ViewUpdateDispatcher.h: Source/WTF: A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big. But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock. The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. 
The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock. This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time. On a locking microbenchmark, this new Lock exhibits the following performance characteristics: - Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex. - Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex. - CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock. - Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock. This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time. This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock. 
But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665. Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head. [1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks: Added. * benchmarks/LockSpeedTest.cpp: Added. (main): * wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong): * wtf/CMakeLists.txt: * wtf/Lock.cpp: Added. (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): * wtf/Lock.h: Added. (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocatorHandle::shrink): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): (WTF::MetaAllocator::addFreshFreeSpace): (WTF::MetaAllocator::debugFreeSpaceSize): * wtf/MetaAllocator.h: * wtf/SpinLock.h: * wtf/ThreadingPthreads.cpp: * wtf/ThreadingWin.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicStringImpl.cpp: (WTF::AtomicStringTableLocker::AtomicStringTableLocker): Tools: * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Lock.cpp: Added. (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/165908@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-07 22:38:59 +00:00
{
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen. Source/JavaScriptCore: This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again. It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff: - The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors. - The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of. - All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. 
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
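The factoring described in the WTF part of the commit message above, static methods templatized on the lock type, an isHeldBit, and a hasParkedBit, each taking the atomic word as their first argument, can be sketched roughly as follows. This is not the contents of wtf/LockAlgorithm.h: `ToyLockAlgorithm`, `ToyCellLock`, `demoPackedLock`, and the bit values 0x40 and 0x80 are hypothetical, `std::atomic` stands in for WTF's `Atomic<>`, and the slow path simply yields instead of parking, so the parked bit is reserved but never set here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Toy lock algorithm projected onto two bits of an existing atomic field.
// All other bits of the word are left untouched by every operation.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct ToyLockAlgorithm {
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType oldValue = word.load(std::memory_order_relaxed);
        for (;;) {
            if (oldValue & isHeldBit)
                return false;
            // On failure, compare_exchange_weak reloads oldValue for the retry.
            if (word.compare_exchange_weak(oldValue, static_cast<LockType>(oldValue | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    static void lock(std::atomic<LockType>& word)
    {
        // Simplification: a real adaptive lock would park after a few spins.
        while (!tryLock(word))
            std::this_thread::yield();
    }

    static void unlock(std::atomic<LockType>& word)
    {
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};

// Hypothetical placement: the top two bits of a byte that also carries other data.
using ToyCellLock = ToyLockAlgorithm<uint8_t, 0x40, 0x80>;

// Locks and unlocks a byte whose low bits hold unrelated data, then returns the byte.
uint8_t demoPackedLock()
{
    std::atomic<uint8_t> indexingTypeAndMisc { 0x07 };
    ToyCellLock::lock(indexingTypeAndMisc);
    bool heldWhileLocked = ToyCellLock::isLocked(indexingTypeAndMisc);
    bool secondAcquireFailed = !ToyCellLock::tryLock(indexingTypeAndMisc);
    ToyCellLock::unlock(indexingTypeAndMisc);
    assert(heldWhileLocked && secondAcquireFailed);
    (void)heldWhileLocked;
    (void)secondAcquireFailed;
    return indexingTypeAndMisc.load();
}
```

Passing the word by reference, rather than wrapping it in a lock class, is what lets the same algorithm run against a field that already exists for another purpose, which is the design choice the commit message argues for.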
if (UNLIKELY(!DefaultLockAlgorithm::lockFastAssumingZero(m_byte)))
lockSlow();
Lightweight locks should be adaptive https://bugs.webkit.org/show_bug.cgi?id=147545 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * dfg/DFGCommon.cpp: (JSC::DFG::startCrashing): * heap/CopiedBlock.h: (JSC::CopiedBlock::workListLock): * heap/CopiedBlockInlines.h: (JSC::CopiedBlock::shouldReportLiveBytes): (JSC::CopiedBlock::reportLiveBytes): * heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock): * heap/CopiedSpace.h: (JSC::CopiedSpace::CopiedGeneration::CopiedGeneration): * heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleEvacuatedBlock): * heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::didStartCopying): * heap/GCThreadSharedData.h: (JSC::GCThreadSharedData::getNextBlocksToCopy): * heap/ListableHandler.h: (JSC::ListableHandler::List::addThreadSafe): (JSC::ListableHandler::List::addNotThreadSafe): * heap/MachineStackMarker.cpp: (JSC::MachineThreads::tryCopyOtherThreadStacks): * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::copyLater): * parser/SourceProvider.cpp: (JSC::SourceProvider::~SourceProvider): (JSC::SourceProvider::getID): * profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::addDatabaseToAtExit): (JSC::Profiler::Database::removeDatabaseFromAtExit): (JSC::Profiler::Database::removeFirstAtExitDatabase): * runtime/TypeProfilerLog.h: Source/WebCore: * bindings/objc/WebScriptObject.mm: (WebCore::getJSWrapper): (WebCore::addJSWrapper): (WebCore::removeJSWrapper): (WebCore::removeJSWrapperIfRetainCountOne): * platform/audio/mac/CARingBuffer.cpp: (WebCore::CARingBuffer::setCurrentFrameBounds): (WebCore::CARingBuffer::getCurrentFrameBounds): * platform/audio/mac/CARingBuffer.h: * platform/ios/wak/WAKWindow.mm: (-[WAKWindow setExposedScrollViewRect:]): (-[WAKWindow exposedScrollViewRect]): Source/WebKit2: * WebProcess/WebPage/EventDispatcher.cpp: (WebKit::EventDispatcher::clearQueuedTouchEventsForPage): (WebKit::EventDispatcher::getQueuedTouchEventsForPage): (WebKit::EventDispatcher::touchEvent): 
(WebKit::EventDispatcher::dispatchTouchEvents): * WebProcess/WebPage/EventDispatcher.h: * WebProcess/WebPage/ViewUpdateDispatcher.cpp: (WebKit::ViewUpdateDispatcher::visibleContentRectUpdate): (WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate): * WebProcess/WebPage/ViewUpdateDispatcher.h: Source/WTF: A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big. But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock. The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. 
The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock.

This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time.

On a locking microbenchmark, this new Lock exhibits the following performance characteristics:

- Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex.
- Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex.
- CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock.
- Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock.

This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time.

This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock.
But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665. Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head. [1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks: Added. * benchmarks/LockSpeedTest.cpp: Added. (main): * wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong): * wtf/CMakeLists.txt: * wtf/Lock.cpp: Added. (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): * wtf/Lock.h: Added. (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocatorHandle::shrink): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): (WTF::MetaAllocator::addFreshFreeSpace): (WTF::MetaAllocator::debugFreeSpaceSize): * wtf/MetaAllocator.h: * wtf/SpinLock.h: * wtf/ThreadingPthreads.cpp: * wtf/ThreadingWin.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicStringImpl.cpp: (WTF::AtomicStringTableLocker::AtomicStringTableLocker): Tools: * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Lock.cpp: Added. (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/165908@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
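The three levels of adaptation described in this ChangeLog entry (an inlined CAS fast path, a short burst of spinning, then sleeping) can be sketched roughly as below. This is not the WTF::Lock implementation: the names `AdaptiveLockSketch`, `hasSleepersBit`, and `runAdaptiveLockDemo`, the spin limit of 40, and the use of a `std::mutex`/`std::condition_variable` pair in place of the queue of thread nodes are all assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of an adaptive lock; not the WTF::Lock implementation.
class AdaptiveLockSketch {
public:
    void lock()
    {
        // Fast path: a single CAS acquires a lock that is not held.
        uint8_t expected = 0;
        if (m_state.compare_exchange_weak(expected, isHeldBit, std::memory_order_acquire))
            return;
        lockSlow();
    }

    void unlock()
    {
        // Fast path: a single CAS releases a lock that nobody is sleeping on.
        // A weak CAS may fail spuriously, so unlockSlow() must cope with no sleepers.
        uint8_t expected = isHeldBit;
        if (m_state.compare_exchange_weak(expected, 0, std::memory_order_release))
            return;
        unlockSlow();
    }

private:
    static constexpr uint8_t isHeldBit = 1;
    static constexpr uint8_t hasSleepersBit = 2;
    static constexpr unsigned spinLimit = 40; // Assumed value, chosen for the sketch.

    void lockSlow()
    {
        unsigned spins = 0;
        for (;;) {
            uint8_t state = m_state.load(std::memory_order_relaxed);
            if (!(state & isHeldBit)) {
                uint8_t held = static_cast<uint8_t>(state | isHeldBit);
                if (m_state.compare_exchange_weak(state, held, std::memory_order_acquire))
                    return;
                continue;
            }
            // Microcontention: spin briefly while nobody is sleeping yet.
            if (!(state & hasSleepersBit) && spins < spinLimit) {
                ++spins;
                std::this_thread::yield();
                continue;
            }
            // Long wait: advertise a sleeper, then block until an unlock wakes us.
            std::unique_lock<std::mutex> guard(m_sleepMutex);
            state = m_state.load(std::memory_order_relaxed);
            if (!(state & isHeldBit))
                continue;
            uint8_t marked = static_cast<uint8_t>(state | hasSleepersBit);
            if (!(state & hasSleepersBit)
                && !m_state.compare_exchange_weak(state, marked, std::memory_order_relaxed))
                continue;
            m_sleepCondition.wait(guard); // Spurious wakeups are fine: the loop rechecks.
        }
    }

    void unlockSlow()
    {
        // Clearing the state under the mutex means no sleeper can miss the wakeup.
        std::lock_guard<std::mutex> guard(m_sleepMutex);
        m_state.store(0, std::memory_order_release);
        m_sleepCondition.notify_all();
    }

    std::atomic<uint8_t> m_state { 0 };
    std::mutex m_sleepMutex;
    std::condition_variable m_sleepCondition;
};

// Hypothetical demo: four threads increment a plain int under the lock.
int runAdaptiveLockDemo()
{
    AdaptiveLockSketch lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i) {
        threads.emplace_back([&] {
            for (int j = 0; j < 10000; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

The sketch keeps its sleep state outside the lock word, so it is far larger than sizeof(void*); it only illustrates the control flow, including why the unlock slow path has to tolerate being entered when no thread is waiting.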
2015-08-07 22:38:59 +00:00
}
bool tryLock() WTF_ACQUIRES_LOCK_IF(true) // NOLINT: Intentional deviation to support std::scoped_lock.
Use WTF::Lock and WTF::Condition instead of WTF::Mutex, WTF::ThreadCondition, std::mutex, and std::condition_variable https://bugs.webkit.org/show_bug.cgi?id=147999 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * API/JSVirtualMachine.mm: (initWrapperCache): (+[JSVMWrapperCache addWrapper:forJSContextGroupRef:]): (+[JSVMWrapperCache wrapperForJSContextGroupRef:]): (wrapperCacheMutex): Deleted. * bytecode/SamplingTool.cpp: (JSC::SamplingTool::doRun): (JSC::SamplingTool::notifyOfScope): * bytecode/SamplingTool.h: * dfg/DFGThreadData.h: * dfg/DFGWorklist.cpp: (JSC::DFG::Worklist::~Worklist): (JSC::DFG::Worklist::isActiveForVM): (JSC::DFG::Worklist::enqueue): (JSC::DFG::Worklist::compilationState): (JSC::DFG::Worklist::waitUntilAllPlansForVMAreReady): (JSC::DFG::Worklist::removeAllReadyPlansForVM): (JSC::DFG::Worklist::completeAllReadyPlansForVM): (JSC::DFG::Worklist::visitWeakReferences): (JSC::DFG::Worklist::removeDeadPlans): (JSC::DFG::Worklist::queueLength): (JSC::DFG::Worklist::dump): (JSC::DFG::Worklist::runThread): * dfg/DFGWorklist.h: * disassembler/Disassembler.cpp: * heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock): (JSC::CopiedSpace::doneCopying): * heap/CopiedSpace.h: * heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleBorrowedBlock): (JSC::CopiedSpace::allocateBlockForCopyingPhase): * heap/GCThread.cpp: (JSC::GCThread::waitForNextPhase): (JSC::GCThread::gcThreadMain): * heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::GCThreadSharedData): (JSC::GCThreadSharedData::~GCThreadSharedData): (JSC::GCThreadSharedData::startNextPhase): (JSC::GCThreadSharedData::endCurrentPhase): (JSC::GCThreadSharedData::didStartMarking): (JSC::GCThreadSharedData::didFinishMarking): * heap/GCThreadSharedData.h: * heap/HeapTimer.h: * heap/MachineStackMarker.cpp: (JSC::ActiveMachineThreadsManager::Locker::Locker): (JSC::ActiveMachineThreadsManager::add): (JSC::ActiveMachineThreadsManager::remove): 
(JSC::ActiveMachineThreadsManager::ActiveMachineThreadsManager): (JSC::MachineThreads::~MachineThreads): (JSC::MachineThreads::addCurrentThread): (JSC::MachineThreads::removeThreadIfFound): (JSC::MachineThreads::tryCopyOtherThreadStack): (JSC::MachineThreads::tryCopyOtherThreadStacks): (JSC::MachineThreads::gatherConservativeRoots): * heap/MachineStackMarker.h: * heap/SlotVisitor.cpp: (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::mergeOpaqueRoots): * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::containsOpaqueRootTriState): * inspector/remote/RemoteInspectorDebuggableConnection.h: * inspector/remote/RemoteInspectorDebuggableConnection.mm: (Inspector::RemoteInspectorHandleRunSourceGlobal): (Inspector::RemoteInspectorQueueTaskOnGlobalQueue): (Inspector::RemoteInspectorInitializeGlobalQueue): (Inspector::RemoteInspectorHandleRunSourceWithInfo): (Inspector::RemoteInspectorDebuggableConnection::setup): (Inspector::RemoteInspectorDebuggableConnection::closeFromDebuggable): (Inspector::RemoteInspectorDebuggableConnection::close): (Inspector::RemoteInspectorDebuggableConnection::sendMessageToBackend): (Inspector::RemoteInspectorDebuggableConnection::queueTaskOnPrivateRunLoop): * interpreter/JSStack.cpp: (JSC::JSStack::JSStack): (JSC::JSStack::releaseExcessCapacity): (JSC::JSStack::addToCommittedByteCount): (JSC::JSStack::committedByteCount): (JSC::stackStatisticsMutex): Deleted. (JSC::JSStack::initializeThreading): Deleted. * interpreter/JSStack.h: (JSC::JSStack::gatherConservativeRoots): (JSC::JSStack::sanitizeStack): (JSC::JSStack::size): (JSC::JSStack::initializeThreading): Deleted. 
* jit/ExecutableAllocator.cpp: (JSC::DemandExecutableAllocator::DemandExecutableAllocator): (JSC::DemandExecutableAllocator::~DemandExecutableAllocator): (JSC::DemandExecutableAllocator::bytesAllocatedByAllAllocators): (JSC::DemandExecutableAllocator::bytesCommittedByAllocactors): (JSC::DemandExecutableAllocator::dumpProfileFromAllAllocators): (JSC::DemandExecutableAllocator::allocators): (JSC::DemandExecutableAllocator::allocatorsMutex): * jit/JITThunks.cpp: (JSC::JITThunks::ctiStub): * jit/JITThunks.h: * profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::ensureBytecodesFor): (JSC::Profiler::Database::notifyDestruction): * profiler/ProfilerDatabase.h: * runtime/InitializeThreading.cpp: (JSC::initializeThreading): * runtime/JSLock.cpp: (JSC::GlobalJSLock::GlobalJSLock): (JSC::GlobalJSLock::~GlobalJSLock): (JSC::JSLockHolder::JSLockHolder): (JSC::GlobalJSLock::initialize): Deleted. * runtime/JSLock.h: Source/WTF: Relanding after fixing a deadlock on Linux. * wtf/Condition.h: "using WTF::Condition". * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): Add tryLock() because it turns out that we use it sometimes. (WTF::LockBase::try_lock): unique_lock needs this. (WTF::LockBase::unlock): * wtf/ParkingLot.cpp: (WTF::ParkingLot::parkConditionally): Work around a Linux C++ bug where wait_until with time_point::max() immediately returns and doesn't flash the lock. Canonical link: https://commits.webkit.org/166166@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188499 268f45cc-cd09-0410-ab3c-d52691b4dbfc
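The note that unique_lock needs try_lock() can be illustrated with a small stand-in type. The class `TinyLockSketch` and the helper `secondTryLockFailsWhileHeld` are hypothetical and only show the shape of the interface: a WebKit-style tryLock() plus a standard-library-style try_lock() alias, which is what lets `std::unique_lock` be constructed with `std::try_to_lock`.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in lock; it spins instead of parking, unlike WTF::Lock.
class TinyLockSketch {
public:
    void lock()
    {
        while (!tryLock())
            std::this_thread::yield();
    }

    bool tryLock()
    {
        bool expected = false;
        return m_isHeld.compare_exchange_strong(expected, true, std::memory_order_acquire);
    }

    // Standard-library spelling, so the type satisfies the Lockable requirements.
    bool try_lock() { return tryLock(); }

    void unlock() { m_isHeld.store(false, std::memory_order_release); }

private:
    std::atomic<bool> m_isHeld { false };
};

// Returns true if a second try-lock attempt fails while the first holder owns the lock.
bool secondTryLockFailsWhileHeld()
{
    TinyLockSketch lock;
    std::unique_lock<TinyLockSketch> first(lock, std::try_to_lock);
    std::unique_lock<TinyLockSketch> second(lock, std::try_to_lock);
    return first.owns_lock() && !second.owns_lock();
}
```

Keeping both spellings lets project code use the camelCase name while standard wrappers such as `std::unique_lock` find the snake_case one they expect.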
2015-08-15 00:14:52 +00:00
{
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again.

It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff:

- The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors.
- The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of.
- All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it.
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
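The idea of projecting a lock algorithm onto bits of an existing atomic field can be sketched as follows. This is a simplified, spin-only illustration, not the contents of wtf/LockAlgorithm.h: it uses `std::atomic` rather than WTF's `Atomic<>`, it omits the hasParkedBit and all parking, and the names `BitLockSketch` and `payloadSurvivesLocking`, the bit value 0x40, and the payload value 0x15 are assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical, spin-only stand-in for the LockAlgorithm idea: static methods that
// treat one bit of an existing atomic field as a lock and leave the other bits alone.
template<typename LockType, LockType isHeldBit>
struct BitLockSketch {
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType oldValue = word.load(std::memory_order_relaxed);
        for (;;) {
            if (oldValue & isHeldBit)
                return false;
            LockType newValue = static_cast<LockType>(oldValue | isHeldBit);
            // On failure, oldValue is refreshed with the current contents of word.
            if (word.compare_exchange_weak(oldValue, newValue, std::memory_order_acquire))
                return true;
        }
    }

    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    static void unlock(std::atomic<LockType>& word)
    {
        // Clear only the lock bit; the unrelated bits keep their values.
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};

// Hypothetical check: locking and unlocking must not disturb the other bits of the byte.
bool payloadSurvivesLocking()
{
    std::atomic<uint8_t> header { 0x15 }; // Assumed payload sharing the byte with the lock.
    using HeaderLock = BitLockSketch<uint8_t, 0x40>;

    HeaderLock::lock(header);
    bool heldWithPayload = HeaderLock::isLocked(header) && (header.load() & 0x3f) == 0x15;
    bool secondAttemptFails = !HeaderLock::tryLock(header);
    HeaderLock::unlock(header);

    return heldWithPayload && secondAttemptFails && header.load() == 0x15;
}
```

Taking the atomic field as the first argument of each static method, rather than wrapping it in a lock class, means the field can keep its own type and layout while still being usable as a lock.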
2016-11-15 01:49:22 +00:00
return DefaultLockAlgorithm::tryLock(m_byte);
2015-08-15 00:14:52 +00:00
}
// Need this version for std::unique_lock.
bool try_lock() WTF_ACQUIRES_LOCK_IF(true)
{
return tryLock();
}
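The snake_case `try_lock()` spelling above exists because `std::unique_lock` looks for exactly that name. A minimal sketch of the pattern follows, assuming a hypothetical `ToyLock` (a spinning stand-in, not WTF::Lock) and a hypothetical helper `incrementIfUncontended`; neither name comes from the original source.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for illustration only; it is NOT WTF::Lock.
// It exposes the three members std::unique_lock requires:
// lock(), try_lock(), and unlock().
class ToyLock {
public:
    void lock()
    {
        // Plain spinning; a real adaptive lock would park the thread instead.
        while (!tryLock()) { }
    }

    bool tryLock()
    {
        bool expected = false;
        return m_held.compare_exchange_strong(expected, true, std::memory_order_acquire);
    }

    // Forwarding overload so that std::unique_lock can find the name it expects.
    bool try_lock() { return tryLock(); }

    void unlock() { m_held.store(false, std::memory_order_release); }

private:
    std::atomic<bool> m_held { false };
};

// Hypothetical helper: runs the critical section only if the lock is free.
bool incrementIfUncontended(ToyLock& lock, int& counter)
{
    std::unique_lock<ToyLock> guard(lock, std::try_to_lock);
    if (!guard.owns_lock())
        return false; // Somebody else holds the lock; give up without waiting.
    ++counter;
    return true; // guard's destructor calls unlock().
}
```

With this shape, callers that prefer the standard RAII wrappers never need to call the camelCase `tryLock()` directly.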
WTF_EXPORT_PRIVATE bool tryLockWithTimeout(Seconds timeout) WTF_ACQUIRES_LOCK_IF(true);
WTF::Lock should be fair eventually https://bugs.webkit.org/show_bug.cgi?id=159384 Reviewed by Geoffrey Garen. Source/WTF: In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of locks makes them fast. That post presented lock fairness as a trade-off between two extremes: - Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a thread on the queue. If there was a thread on the queue, the lock is released and that thread is made runnable. That thread may then grab the lock, or some other thread may grab the lock first (it may barge). Usually, the barging thread is the thread that released the lock in the first place. This maximizes throughput but hurts fairness. There is no good theoretical bound on how unfair the lock may become, but empirical data suggests that it's fair enough for the cases we previously measured. - FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock() if there is a thread waiting. If there is a thread waiting, unlock() will make that thread runnable and inform it that it now holds the lock. This ensures perfect round-robin fairness and allows us to reason theoretically about how long it may take for a thread to grab the lock. For example, if we know that only N threads are running and each one may contend on a critical section, and each one may hold the lock for at most S seconds, then the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly in most cases. This is because for the common case of short critical sections, they force a context switch after each critical section if the lock is contended. This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging. Thanks to this new algorithm, you can now have both of these things at the same time. This change makes WTF::Lock eventually fair. 
We can almost (more on the caveats below) guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical sections that are longer than 1ms are always fair. For shorter critical sections, the amount of time that any thread waits is 1ms times the number of threads. There are some caveats that arise from our use of randomness, but even then, in the limit as the critical section length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our experiments show that the lock becomes exactly as fair as a FIFO lock for any critical section that is 1ms or longer. The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair. ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each other via a token. unparkOne() can pass a token, which parkConditionally() will return. This change also makes parkConditionally() a lot more precise about when it was unparked due to a call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is guaranteed to report that it was unparked rather than timing out, and that thread is guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing fair unlocking. In that case, unlock() will not release the lock, and lock() will know that it holds the lock as soon as parkConditionally() returns. 
Note that this algorithm relies on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock can make a decision about unlock strategy and inject a token while it has complete knowledge over the state of the queue. As such, it's not immediately obvious how to implement this algorithm on top of futexes. You really need ParkingLot! WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(), which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms at random. When a dequeue happens and there are threads that actually get dequeued, we check if the time since the last fair unlock (the last time timeToBeFair was set to true) is more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout. This means that in the absence of ParkingLot collisions, fair unlocking is guaranteed to happen at least once per millisecond. It will happen at 2 kHz on average. If there are collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid resonance. Imagine a program in which some thread acquires a lock at 1 kHz in-phase with the timeToBeFair timeout. Then this thread would be the beneficiary of fairness to the detriment of everyone else. Randomness ensures that we aren't too fair to any one thread. Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous improvement in LockFairnessTest. 
It's common for an unfair lock (either our BargingLock, the old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to allow only one thread to hold the lock during a whole second in which each thread is holding the lock for 1ms at a time. This is because in a barging lock, releasing a lock after holding it for 1ms and then reacquiring it immediately virtually ensures that none of the other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock handles this case like a champ: each thread gets equal turns. Here's some data. If we launch 10 threads and have each of them run for 1 second while repeatedly holding a critical section for 1ms, then here's how many times each thread gets to hold the lock using the old WTF::Lock algorithm: 799, 6, 1, 1, 1, 1, 1, 1, 1, 1 One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock becomes totally fair: 80, 79, 79, 79, 79, 79, 79, 80, 80, 79 I don't know of anyone creating such an automatically-fair adaptive lock before, so I think that this is a pretty awesome advancement to the state of the art! This change is good for three reasons: - We do have long critical sections in WebKit and we don't want to have to worry about starvation. This reduces the likelihood that we will see starvation due to our lock strategy. - I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or lockFairly() or some moral equivalent for the scavenger thread. - If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability to unlock and relock without barging. 
* benchmarks/LockFairnessTest.cpp: (main): * benchmarks/ToyLocks.h: * wtf/Condition.h: (WTF::ConditionBase::waitUntil): (WTF::ConditionBase::notifyOne): * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): * wtf/Lock.h: (WTF::LockBase::try_lock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (WTF::LockBase::isFullyReset): * wtf/ParkingLot.cpp: (WTF::ParkingLot::parkConditionallyImpl): (WTF::ParkingLot::unparkOne): (WTF::ParkingLot::unparkOneImpl): (WTF::ParkingLot::unparkAll): * wtf/ParkingLot.h: (WTF::ParkingLot::parkConditionally): (WTF::ParkingLot::compareAndPark): (WTF::ParkingLot::unparkOne): Tools: * TestWebKitAPI/Tests/WTF/ParkingLot.cpp: Canonical link: https://commits.webkit.org/178039@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
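The randomized fairness timeout described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the ParkingLot implementation: the class name `FairnessTimer`, its constructor, and the injected `now` parameter are all hypothetical, and the real code keeps this state per ParkingLot bucket.

```cpp
#include <chrono>
#include <random>

// Hypothetical sketch of a stochastic fairness timeout. Every name here is
// an assumption made for illustration; none is part of the WTF API.
class FairnessTimer {
public:
    using Clock = std::chrono::steady_clock;

    explicit FairnessTimer(Clock::time_point now, unsigned seed = 1)
        : m_random(seed)
    {
        m_deadline = now + nextTimeout();
    }

    // Intended to be called when an unlock finds a waiting thread.
    // Returns true when the lock should be handed directly to the waiter,
    // and in that case draws a fresh random deadline.
    bool timeToBeFair(Clock::time_point now)
    {
        if (now < m_deadline)
            return false; // Too soon: allow barging for throughput.
        m_deadline = now + nextTimeout();
        return true;
    }

private:
    // Uniform in [0ms, 1ms]; the randomness avoids resonance with a thread
    // that happens to acquire the lock in phase with a fixed period.
    Clock::duration nextTimeout()
    {
        std::uniform_int_distribution<int> micros(0, 1000);
        return std::chrono::duration_cast<Clock::duration>(
            std::chrono::microseconds(micros(m_random)));
    }

    std::mt19937 m_random;
    Clock::time_point m_deadline;
};
```

Passing the current time in as a parameter, rather than reading the clock inside the class, is a choice made here only so the sketch can be exercised deterministically.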
// Relinquish the lock. Either one of the threads that were waiting for the lock, or some other
// thread that happens to be running, will be able to grab the lock. This bit of unfairness is
// called barging, and we allow it because it maximizes throughput. However, we bound how unfair
// barging can get by ensuring that every once in a while, when there is a thread waiting on the
// lock, we hand the lock to that thread directly. Every time unlock() finds a thread waiting,
// we check if the last time that we did a fair unlock was more than roughly 1ms ago; if so, we
// unlock fairly. Fairness matters most for long critical sections, and this virtually
// guarantees that long critical sections always get a fair lock.
void unlock() WTF_RELEASES_LOCK()
{
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen. Source/JavaScriptCore: This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again. It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff: - The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors. - The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of. - All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. 
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
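The commit message above describes LockAlgorithm as a struct templatized on the lock type, the isHeldBit value, and the hasParkedBit value, with static methods that take the atomic word as their first argument. A minimal sketch of that shape follows. It is not the WTF implementation: the name SketchLockAlgorithm is hypothetical, std::atomic stands in for WTF's Atomic<>, and contended waiters simply yield instead of parking, so hasParkedBit is only reserved here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical illustration of "projecting" a lock onto spare bits of an
// existing atomic word. All other bits of the word are left untouched.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct SketchLockAlgorithm {
    // Try once to set isHeldBit; fail if another thread already holds it.
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        for (;;) {
            if (old & isHeldBit)
                return false;
            LockType desired = static_cast<LockType>(old | isHeldBit);
            if (word.compare_exchange_weak(old, desired, std::memory_order_acquire, std::memory_order_relaxed))
                return true;
            // A failed weak CAS reloads `old`; loop and re-check.
        }
    }

    // Assumption: yield-and-retry replaces the real parking slow path.
    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    // Clear only isHeldBit so unrelated bits in the word survive.
    static void unlock(std::atomic<LockType>& word)
    {
        assert(word.load(std::memory_order_relaxed) & isHeldBit);
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};
```

With this shape, a byte that already carries other flags can double as a lock, for example `SketchLockAlgorithm<std::uint8_t, 0x40, 0x80>::lock(header)` on a hypothetical `std::atomic<std::uint8_t> header`; the bit positions are chosen for illustration only.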
if (UNLIKELY(!DefaultLockAlgorithm::unlockFastAssumingZero(m_byte)))
unlockSlow();
Lightweight locks should be adaptive https://bugs.webkit.org/show_bug.cgi?id=147545 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * dfg/DFGCommon.cpp: (JSC::DFG::startCrashing): * heap/CopiedBlock.h: (JSC::CopiedBlock::workListLock): * heap/CopiedBlockInlines.h: (JSC::CopiedBlock::shouldReportLiveBytes): (JSC::CopiedBlock::reportLiveBytes): * heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock): * heap/CopiedSpace.h: (JSC::CopiedSpace::CopiedGeneration::CopiedGeneration): * heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleEvacuatedBlock): * heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::didStartCopying): * heap/GCThreadSharedData.h: (JSC::GCThreadSharedData::getNextBlocksToCopy): * heap/ListableHandler.h: (JSC::ListableHandler::List::addThreadSafe): (JSC::ListableHandler::List::addNotThreadSafe): * heap/MachineStackMarker.cpp: (JSC::MachineThreads::tryCopyOtherThreadStacks): * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::copyLater): * parser/SourceProvider.cpp: (JSC::SourceProvider::~SourceProvider): (JSC::SourceProvider::getID): * profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::addDatabaseToAtExit): (JSC::Profiler::Database::removeDatabaseFromAtExit): (JSC::Profiler::Database::removeFirstAtExitDatabase): * runtime/TypeProfilerLog.h: Source/WebCore: * bindings/objc/WebScriptObject.mm: (WebCore::getJSWrapper): (WebCore::addJSWrapper): (WebCore::removeJSWrapper): (WebCore::removeJSWrapperIfRetainCountOne): * platform/audio/mac/CARingBuffer.cpp: (WebCore::CARingBuffer::setCurrentFrameBounds): (WebCore::CARingBuffer::getCurrentFrameBounds): * platform/audio/mac/CARingBuffer.h: * platform/ios/wak/WAKWindow.mm: (-[WAKWindow setExposedScrollViewRect:]): (-[WAKWindow exposedScrollViewRect]): Source/WebKit2: * WebProcess/WebPage/EventDispatcher.cpp: (WebKit::EventDispatcher::clearQueuedTouchEventsForPage): (WebKit::EventDispatcher::getQueuedTouchEventsForPage): (WebKit::EventDispatcher::touchEvent): 
(WebKit::EventDispatcher::dispatchTouchEvents): * WebProcess/WebPage/EventDispatcher.h: * WebProcess/WebPage/ViewUpdateDispatcher.cpp: (WebKit::ViewUpdateDispatcher::visibleContentRectUpdate): (WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate): * WebProcess/WebPage/ViewUpdateDispatcher.h: Source/WTF: A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big. But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock. The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. 
The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock. This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time. On a locking microbenchmark, this new Lock exhibits the following performance characteristics: - Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex. - Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex. - CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock. - Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock. This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time. This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock. 
But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665. Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head. [1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks: Added. * benchmarks/LockSpeedTest.cpp: Added. (main): * wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong): * wtf/CMakeLists.txt: * wtf/Lock.cpp: Added. (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): * wtf/Lock.h: Added. (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocatorHandle::shrink): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): (WTF::MetaAllocator::addFreshFreeSpace): (WTF::MetaAllocator::debugFreeSpaceSize): * wtf/MetaAllocator.h: * wtf/SpinLock.h: * wtf/ThreadingPthreads.cpp: * wtf/ThreadingWin.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicStringImpl.cpp: (WTF::AtomicStringTableLocker::AtomicStringTableLocker): Tools: * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Lock.cpp: Added. (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/165908@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-07 22:38:59 +00:00
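The three levels of adaptation described in the commit message above (inlined CAS fast path, a bounded burst of spinning, then sleeping) can be sketched as follows. This is a simplified model under stated assumptions, not the code from that patch: SketchAdaptiveLock is a hypothetical name, a std::mutex plus std::condition_variable stands in for the queue of parked threads, the lock state is two bits in a byte rather than a tagged pointer, and the spin limit of 40 is arbitrary. Note that unlockSlow() tolerates being entered with no waiters, which is the situation the message says a spuriously failing weak CAS can create.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

class SketchAdaptiveLock {
public:
    void lock()
    {
        // Fast path: one weak CAS from "not held" to "held".
        std::uint8_t expected = 0;
        if (m_state.compare_exchange_weak(expected, isHeldBit, std::memory_order_acquire, std::memory_order_relaxed))
            return;
        lockSlow();
    }

    void unlock()
    {
        assert(isHeld());
        // Fast path: succeeds only if nobody is parked. May also fail spuriously.
        std::uint8_t expected = isHeldBit;
        if (m_state.compare_exchange_weak(expected, 0, std::memory_order_release, std::memory_order_relaxed))
            return;
        unlockSlow();
    }

    bool isHeld() const { return m_state.load(std::memory_order_acquire) & isHeldBit; }

private:
    static constexpr std::uint8_t isHeldBit = 1;
    static constexpr std::uint8_t hasParkedBit = 2;
    static constexpr unsigned spinLimit = 40; // Arbitrary value for the sketch.

    void lockSlow()
    {
        unsigned spinCount = 0;
        for (;;) {
            std::uint8_t state = m_state.load(std::memory_order_relaxed);
            if (!(state & isHeldBit)) {
                // Lock looks free: try to take it (barging is allowed).
                auto desired = static_cast<std::uint8_t>(state | isHeldBit);
                if (m_state.compare_exchange_weak(state, desired, std::memory_order_acquire, std::memory_order_relaxed))
                    return;
                continue;
            }
            if (!(state & hasParkedBit) && spinCount < spinLimit) {
                // Microcontention: spin briefly before paying for a sleep.
                ++spinCount;
                std::this_thread::yield();
                continue;
            }
            // Long wait: publish hasParkedBit under the queue lock, then sleep.
            std::unique_lock<std::mutex> locker(m_queueLock);
            state = m_state.load(std::memory_order_relaxed);
            if (!(state & isHeldBit))
                continue; // Released in the meantime; retry.
            auto parked = static_cast<std::uint8_t>(state | hasParkedBit);
            if (!(state & hasParkedBit) && !m_state.compare_exchange_weak(state, parked, std::memory_order_relaxed, std::memory_order_relaxed))
                continue;
            m_queue.wait(locker); // Woken threads loop and re-contend.
        }
    }

    void unlockSlow()
    {
        // Safe even when no thread is waiting (spurious fast-path failure).
        std::lock_guard<std::mutex> locker(m_queueLock);
        m_state.store(0, std::memory_order_release);
        m_queue.notify_all();
    }

    std::atomic<std::uint8_t> m_state { 0 };
    std::mutex m_queueLock;
    std::condition_variable m_queue;
};
```

The sketch trades space for simplicity: it is far larger than sizeof(void*), which is exactly the cost the real design avoids by keeping the queue outside the lock word.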
}
WTF::Lock should be fair eventually https://bugs.webkit.org/show_bug.cgi?id=159384 Reviewed by Geoffrey Garen. Source/WTF: In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of locks makes them fast. That post presented lock fairness as a trade-off between two extremes: - Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a thread on the queue. If there was a thread on the queue, the lock is released and that thread is made runnable. That thread may then grab the lock, or some other thread may grab the lock first (it may barge). Usually, the barging thread is the thread that released the lock in the first place. This maximizes throughput but hurts fairness. There is no good theoretical bound on how unfair the lock may become, but empirical data suggests that it's fair enough for the cases we previously measured. - FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock() if there is a thread waiting. If there is a thread waiting, unlock() will make that thread runnable and inform it that it now holds the lock. This ensures perfect round-robin fairness and allows us to reason theoretically about how long it may take for a thread to grab the lock. For example, if we know that only N threads are running and each one may contend on a critical section, and each one may hold the lock for at most S seconds, then the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly in most cases. This is because for the common case of short critical sections, they force a context switch after each critical section if the lock is contended. This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging. Thanks to this new algorithm, you can now have both of these things at the same time. This change makes WTF::Lock eventually fair. 
We can almost (more on the caveats below) guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical sections that are longer than 1ms are always fair. For shorter critical sections, the amount of time that any thread waits is 1ms times the number of threads. There are some caveats that arise from our use of randomness, but even then, in the limit as the critical section length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our experiments show that the lock becomes exactly as fair as a FIFO lock for any critical section that is 1ms or longer. The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair. ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each other via a token. unparkOne() can pass a token, which parkConditionally() will return. This change also makes parkConditionally() a lot more precise about when it was unparked due to a call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is guaranteed to report that it was unparked rather than timing out, and that thread is guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing fair unlocking. In that case, unlock() will not release the lock, and lock() will know that it holds the lock as soon as parkConditionally() returns. 
Note that this algorithm relies on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock can make a decision about unlock strategy and inject a token while it has complete knowledge over the state of the queue. As such, it's not immediately obvious how to implement this algorithm on top of futexes. You really need ParkingLot! WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(), which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms at random. When a dequeue happens and there are threads that actually get dequeued, we check if the time since the last fair unlock (the last time timeToBeFair was set to true) is more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout. This means that in the absence of ParkingLot collisions, fair unlocking is guaranteed to happen at least once per millisecond. It will happen at 2 kHz on average. If there are collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid resonance. Imagine a program in which some thread acquires a lock at 1 kHz in-phase with the timeToBeFair timeout. Then this thread would be the beneficiary of fairness to the detriment of everyone else. Randomness ensures that we aren't too fair to any one thread. Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous improvement in LockFairnessTest. 
It's common for an unfair lock (either our BargingLock, the old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to allow only one thread to hold the lock during a whole second in which each thread is holding the lock for 1ms at a time. This is because in a barging lock, releasing a lock after holding it for 1ms and then reacquiring it immediately virtually ensures that none of the other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock handles this case like a champ: each thread gets equal turns. Here's some data. If we launch 10 threads and have each of them run for 1 second while repeatedly holding a critical section for 1ms, then here's how many times each thread gets to hold the lock using the old WTF::Lock algorithm: 799, 6, 1, 1, 1, 1, 1, 1, 1, 1 One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock becomes totally fair: 80, 79, 79, 79, 79, 79, 79, 80, 80, 79 I don't know of anyone creating such an automatically-fair adaptive lock before, so I think that this is a pretty awesome advancement to the state of the art! This change is good for three reasons: - We do have long critical sections in WebKit and we don't want to have to worry about starvation. This reduces the likelihood that we will see starvation due to our lock strategy. - I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or lockFairly() or some moral equivalent for the scavenger thread. - If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability to unlock and relock without barging. 
* benchmarks/LockFairnessTest.cpp: (main): * benchmarks/ToyLocks.h: * wtf/Condition.h: (WTF::ConditionBase::waitUntil): (WTF::ConditionBase::notifyOne): * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): * wtf/Lock.h: (WTF::LockBase::try_lock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (WTF::LockBase::isFullyReset): * wtf/ParkingLot.cpp: (WTF::ParkingLot::parkConditionallyImpl): (WTF::ParkingLot::unparkOne): (WTF::ParkingLot::unparkOneImpl): (WTF::ParkingLot::unparkAll): * wtf/ParkingLot.h: (WTF::ParkingLot::parkConditionally): (WTF::ParkingLot::compareAndPark): (WTF::ParkingLot::unparkOne): Tools: * TestWebKitAPI/Tests/WTF/ParkingLot.cpp: Canonical link: https://commits.webkit.org/178039@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
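The stochastic fairness timeout described in the commit message above can be modeled in a few lines. The sketch below is a guess at the bookkeeping only, not ParkingLot's code: SketchFairnessTimer and its members are hypothetical names, and the caller passes in the current time so the behavior is easy to reason about. Each time the deadline has passed, the next deadline is placed a random 0 to 1 ms in the future, which is what avoids the resonance problem of a fixed 1 ms period.

```cpp
#include <cassert>
#include <chrono>
#include <random>

class SketchFairnessTimer {
public:
    using Clock = std::chrono::steady_clock;

    explicit SketchFairnessTimer(unsigned seed = 1)
        : m_random(seed)
    {
    }

    // Call when a dequeue actually wakes a thread. Returns true when the
    // current deadline has passed, meaning this unlock should hand the lock
    // directly to the woken thread, and then picks a new random deadline.
    bool timeToBeFair(Clock::time_point now)
    {
        if (now <= m_nextFairTime)
            return false;
        std::uniform_int_distribution<long> delayInMicroseconds(0, 1000);
        auto delay = std::chrono::microseconds(delayInMicroseconds(m_random));
        m_nextFairTime = now + std::chrono::duration_cast<Clock::duration>(delay);
        assert(m_nextFairTime >= now);
        return true;
    }

    Clock::time_point nextFairTime() const { return m_nextFairTime; }

private:
    std::mt19937 m_random;
    Clock::time_point m_nextFairTime { }; // Clock epoch, so the first call is fair.
};
```

In a lock built this way, the unlock path would consult `timeToBeFair(Clock::now())` from inside the unpark callback and pass a nonzero token to the woken thread when it returns true; that wiring is omitted here because it depends on ParkingLot itself.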
// This is like unlock() but it guarantees that we unlock the lock fairly. For short critical
// sections, this is much slower than unlock(). For long critical sections, unlock() will learn
// to be fair anyway. However, if you plan to relock the lock right after unlocking and you want
// to ensure that some other thread runs in the meantime, this is probably the function you
// want.
void unlockFairly() WTF_RELEASES_LOCK()
WTF::Lock should be fair eventually https://bugs.webkit.org/show_bug.cgi?id=159384 Reviewed by Geoffrey Garen. Source/WTF: In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of locks makes them fast. That post presented lock fairness as a trade-off between two extremes: - Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a thread on the queue. If there was a thread on the queue, the lock is released and that thread is made runnable. That thread may then grab the lock, or some other thread may grab the lock first (it may barge). Usually, the barging thread is the thread that released the lock in the first place. This maximizes throughput but hurts fairness. There is no good theoretical bound on how unfair the lock may become, but empirical data suggests that it's fair enough for the cases we previously measured. - FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock() if there is a thread waiting. If there is a thread waiting, unlock() will make that thread runnable and inform it that it now holds the lock. This ensures perfect round-robin fairness and allows us to reason theoretically about how long it may take for a thread to grab the lock. For example, if we know that only N threads are running and each one may contend on a critical section, and each one may hold the lock for at most S seconds, then the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly in most cases. This is because for the common case of short critical sections, they force a context switch after each critical section if the lock is contended. This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging. Thanks to this new algorithm, you can now have both of these things at the same time. This change makes WTF::Lock eventually fair. 
We can almost (more on the caveats below) guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical sections that are longer than 1ms are always fair. For shorter critical sections, the amount of time that any thread waits is 1ms times the number of threads. There are some caveats that arise from our use of randomness, but even then, in the limit as the critical section length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our experiments show that the lock becomes exactly as fair as a FIFO lock for any critical section that is 1ms or longer. The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair. ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each other via a token. unparkOne() can pass a token, which parkConditionally() will return. This change also makes parkConditionally() a lot more precise about when it was unparked due to a call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is guaranteed to report that it was unparked rather than timing out, and that thread is guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing fair unlocking. In that case, unlock() will not release the lock, and lock() will know that it holds the lock as soon as parkConditionally() returns. 
Note that this algorithm relies on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock can make a decision about unlock strategy and inject a token while it has complete knowledge over the state of the queue. As such, it's not immediately obvious how to implement this algorithm on top of futexes. You really need ParkingLot! WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(), which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms at random. When a dequeue happens and there are threads that actually get dequeued, we check if the time since the last fair unlock (the last time timeToBeFair was set to true) is more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout. This means that in the absence of ParkingLot collisions, fair unlocking is guaranteed to happen at least once per millisecond. It will happen at 2 kHz on average. If there are collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid resonance. Imagine a program in which some thread acquires a lock at 1 kHz in-phase with the timeToBeFair timeout. Then this thread would be the beneficiary of fairness to the detriment of everyone else. Randomness ensures that we aren't too fair to any one thread. Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous improvement in LockFairnessTest. 
It's common for an unfair lock (either our BargingLock, the old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to allow only one thread to hold the lock during a whole second in which each thread is holding the lock for 1ms at a time. This is because in a barging lock, releasing a lock after holding it for 1ms and then reacquiring it immediately virtually ensures that none of the other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock handles this case like a champ: each thread gets equal turns. Here's some data. If we launch 10 threads and have each of them run for 1 second while repeatedly holding a critical section for 1ms, then here's how many times each thread gets to hold the lock using the old WTF::Lock algorithm: 799, 6, 1, 1, 1, 1, 1, 1, 1, 1 One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock becomes totally fair: 80, 79, 79, 79, 79, 79, 79, 80, 80, 79 I don't know of anyone creating such an automatically-fair adaptive lock before, so I think that this is a pretty awesome advancement to the state of the art! This change is good for three reasons: - We do have long critical sections in WebKit and we don't want to have to worry about starvation. This reduces the likelihood that we will see starvation due to our lock strategy. - I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or lockFairly() or some moral equivalent for the scavenger thread. - If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability to unlock and relock without barging. 
* benchmarks/LockFairnessTest.cpp: (main): * benchmarks/ToyLocks.h: * wtf/Condition.h: (WTF::ConditionBase::waitUntil): (WTF::ConditionBase::notifyOne): * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): * wtf/Lock.h: (WTF::LockBase::try_lock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (WTF::LockBase::isFullyReset): * wtf/ParkingLot.cpp: (WTF::ParkingLot::parkConditionallyImpl): (WTF::ParkingLot::unparkOne): (WTF::ParkingLot::unparkOneImpl): (WTF::ParkingLot::unparkAll): * wtf/ParkingLot.h: (WTF::ParkingLot::parkConditionally): (WTF::ParkingLot::compareAndPark): (WTF::ParkingLot::unparkOne): Tools: * TestWebKitAPI/Tests/WTF/ParkingLot.cpp: Canonical link: https://commits.webkit.org/178039@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
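The stochastic fairness timeout described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not ParkingLot's actual code: the class name `FairnessTimeout`, its members, and the use of `std::mt19937` with a microsecond-granularity distribution are all assumptions made for the example.

```cpp
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical sketch of a per-bucket fairness timeout. Names are illustrative
// and are not taken from WTF::ParkingLot.
class FairnessTimeout {
public:
    using Clock = std::chrono::steady_clock;

    explicit FairnessTimeout(Clock::time_point start, unsigned seed = 0)
        : m_rng(seed)
        , m_nextFairTime(start)
    {
    }

    // Returns true when the unlocker should hand the lock off fairly.
    // On a true result, the next fair time is rescheduled to
    // now + a uniformly random delay in [0, 1ms], which avoids resonance
    // with threads that acquire the lock at a fixed frequency.
    bool timeToBeFair(Clock::time_point now)
    {
        if (now <= m_nextFairTime)
            return false;
        std::uniform_int_distribution<long long> delayInMicroseconds(0, 1000);
        m_nextFairTime = now + std::chrono::microseconds(delayInMicroseconds(m_rng));
        return true;
    }

private:
    std::mt19937 m_rng;
    Clock::time_point m_nextFairTime;
};
```

Under these assumptions, any query made more than 1ms after the previous fair unlock reports true, which matches the "at least once per millisecond" bound stated in the commit message for the collision-free case.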
{
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen. Source/JavaScriptCore: This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again. It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff: - The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors. - The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of. - All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. 
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
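The idea of projecting a lock algorithm onto two bits of an existing atomic field, as the commit message above describes for LockAlgorithm, might look roughly like the sketch below. It is an assumption-laden illustration: `TwoBitLockSketch` is a hypothetical name, `std::atomic` stands in for `WTF::Atomic`, and the parking slow path (which the real algorithm delegates to ParkingLot) is omitted entirely, so only the try-lock and fast-unlock paths are shown.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Illustrative only: the fast paths of a two-bit lock with static methods that
// take the atomic word as their first argument. Not WTF's LockAlgorithm.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct TwoBitLockSketch {
    // Sets isHeldBit if it is clear; leaves all other bits of the word untouched.
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        for (;;) {
            if (old & isHeldBit)
                return false;
            LockType desired = static_cast<LockType>(old | isHeldBit);
            if (word.compare_exchange_weak(old, desired, std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    // Clears isHeldBit only when the lock is held and no thread is parked.
    // A false result means the caller would need a slow path (not shown here).
    static bool unlockFast(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        for (;;) {
            if ((old & (isHeldBit | hasParkedBit)) != isHeldBit)
                return false;
            LockType desired = static_cast<LockType>(old & ~isHeldBit);
            if (word.compare_exchange_weak(old, desired, std::memory_order_release, std::memory_order_relaxed))
                return true;
        }
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};
```

Because every operation masks only its own two bits, the remaining bits of the byte stay available for unrelated state, which is the property the commit message relies on when it places a lock in spare bits of the JSCell header.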
if (UNLIKELY(!DefaultLockAlgorithm::unlockFastAssumingZero(m_byte)))
unlockFairlySlow();
2016-07-18 18:32:52 +00:00
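The token handshake between unparkOne() and parkConditionally() described in this commit message can be modelled without any threads. The following is a minimal single-threaded sketch, not the real ParkingLot: ToyLock, ToyWaiter, DirectHandoff and the two-field UnparkResult are names assumed for illustration. It only shows the decision the unlock callback makes: return token 1 and keep the lock held when it is time to be fair, otherwise release the lock and return token 0.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>

// Toy model only; field and type names are assumptions for this sketch.
struct UnparkResult {
    bool didUnparkThread = false;
    bool timeToBeFair = false;
};

constexpr intptr_t DirectHandoff = 1; // token meaning "you already own the lock"

struct ToyWaiter {
    explicit ToyWaiter(int id) : id(id) { }
    int id;
    bool parked = true;
    intptr_t token = 0; // what parkConditionally() would return to this waiter
};

struct ToyLock {
    bool held = false;
    std::deque<ToyWaiter*> queue;

    // Models unparkOne(): dequeue at most one waiter, run the callback while
    // the queue is still "locked" (trivially true here, there is one thread),
    // then deliver the callback's return value to the waiter as its token.
    void unparkOne(bool timeToBeFair, const std::function<intptr_t(UnparkResult)>& callback)
    {
        UnparkResult result;
        ToyWaiter* waiter = nullptr;
        if (!queue.empty()) {
            waiter = queue.front();
            queue.pop_front();
            result.didUnparkThread = true;
            result.timeToBeFair = timeToBeFair;
        }
        intptr_t token = callback(result);
        if (waiter) {
            waiter->token = token;
            waiter->parked = false;
        }
    }

    void unlock(bool timeToBeFair)
    {
        assert(held);
        unparkOne(timeToBeFair, [this](UnparkResult result) -> intptr_t {
            if (result.didUnparkThread && result.timeToBeFair)
                return DirectHandoff; // fair: lock stays held, ownership moves to the waiter
            held = false;             // barging: release and let anyone grab it
            return 0;
        });
    }
};
```

In this model a waiter that wakes with DirectHandoff returns from lock() immediately, while a waiter that wakes with 0 has to retry the acquisition and may lose to a barging thread.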
}
PerformanceTests: Concurrent GC should be stable enough to land enabled https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Made CDjs more configurable and refined the "large.js" configuration. I was using that one and the new "long.js" configuration to tune concurrent eden GCs. Added a new way of running Splay in the browser, which uses chartjs to plot the execution times of 2000 iterations. This includes the minified chartjs. * JetStream/Octane2/splay-detail.html: Added. * JetStream/cdjs/benchmark.js: (benchmarkImpl): (benchmark): * JetStream/cdjs/long.js: Added. Source/JavaScriptCore: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. This fixes a ton of performance and correctness bugs revealed by getting the concurrent GC to be stable enough to land enabled. I had to redo the JSObject::visitChildren concurrency protocol again. This time I think it's even more correct than ever! This is an enormous win on JetStream/splay-latency and Octane/SplayLatency. It looks to be mostly neutral on everything else, though Speedometer is showing statistically weak signs of a slight regression. * API/JSAPIWrapperObject.mm: Added locking. (JSC::JSAPIWrapperObject::visitChildren): * API/JSCallbackObject.h: Added locking. (JSC::JSCallbackObjectData::visitChildren): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::setPrivateProperty): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::deletePrivateProperty): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::visitChildren): * CMakeLists.txt: * JavaScriptCore.xcodeproj/project.pbxproj: * bytecode/CodeBlock.cpp: (JSC::CodeBlock::UnconditionalFinalizer::finalizeUnconditionally): This had a TOCTOU race on shouldJettisonDueToOldAge. (JSC::EvalCodeCache::visitAggregate): Moved to EvalCodeCache.cpp. * bytecode/DirectEvalCodeCache.cpp: Added. Outlined some functions and made them use locks. 
(JSC::DirectEvalCodeCache::setSlow): (JSC::DirectEvalCodeCache::clear): (JSC::DirectEvalCodeCache::visitAggregate): * bytecode/DirectEvalCodeCache.h: (JSC::DirectEvalCodeCache::set): (JSC::DirectEvalCodeCache::clear): Deleted. * bytecode/UnlinkedCodeBlock.cpp: Added locking. (JSC::UnlinkedCodeBlock::visitChildren): (JSC::UnlinkedCodeBlock::setInstructions): (JSC::UnlinkedCodeBlock::shrinkToFit): * bytecode/UnlinkedCodeBlock.h: Added locking. (JSC::UnlinkedCodeBlock::addRegExp): (JSC::UnlinkedCodeBlock::addConstant): (JSC::UnlinkedCodeBlock::addFunctionDecl): (JSC::UnlinkedCodeBlock::addFunctionExpr): (JSC::UnlinkedCodeBlock::createRareDataIfNecessary): (JSC::UnlinkedCodeBlock::shrinkToFit): Deleted. * debugger/Debugger.cpp: Use the right delete API. (JSC::Debugger::recompileAllJSFunctions): * dfg/DFGAbstractInterpreterInlines.h: (JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects): Fix a pre-existing bug in ToFunction constant folding. * dfg/DFGClobberize.h: Add support for nuking. (JSC::DFG::clobberize): * dfg/DFGClobbersExitState.cpp: Add support for nuking. (JSC::DFG::clobbersExitState): * dfg/DFGFixupPhase.cpp: Add support for nuking. (JSC::DFG::FixupPhase::fixupNode): (JSC::DFG::FixupPhase::indexForChecks): (JSC::DFG::FixupPhase::originForCheck): (JSC::DFG::FixupPhase::speculateForBarrier): (JSC::DFG::FixupPhase::insertCheck): (JSC::DFG::FixupPhase::fixupChecksInBlock): * dfg/DFGSpeculativeJIT.cpp: Add support for nuking. (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): * ftl/FTLLowerDFGToB3.cpp: Add support for nuking. (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::nukeStructureAndSetButterfly): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): Deleted. 
* heap/CodeBlockSet.cpp: We need to be more careful about the CodeBlockSet workflow during GC, since we will allocate CodeBlocks in eden while collecting. (JSC::CodeBlockSet::clearMarksForFullCollection): (JSC::CodeBlockSet::deleteUnmarkedAndUnreferenced): * heap/Heap.cpp: Added code to measure max pauses. Added a better collectContinuously mode. (JSC::Heap::lastChanceToFinalize): Stop the collectContinuously thread. (JSC::Heap::harvestWeakReferences): Inline SlotVisitor::harvestWeakReferences. (JSC::Heap::finalizeUnconditionalFinalizers): Inline SlotVisitor::finalizeUnconditionalFinalizers. (JSC::Heap::markToFixpoint): We need to do some MarkedSpace stuff before every conservative scan, rather than just at the start of marking, so we now call prepareForConservativeScan() before each conservative scan. Also call a less-parallel version of drainInParallel when the mutator is running. (JSC::Heap::collectInThread): Inline Heap::prepareForAllocation(). (JSC::Heap::stopIfNecessarySlow): We need to be more careful about ensuring that we run finalization before and after stopping. Also, we should sanitize the stack when stopping the world. (JSC::Heap::acquireAccessSlow): Add some optional debug prints. (JSC::Heap::handleNeedFinalize): Assert that we are running this when the world is not stopped. (JSC::Heap::finalize): Remove the old collectContinuously code. (JSC::Heap::requestCollection): We don't need to sanitize the stack here anymore. (JSC::Heap::notifyIsSafeToCollect): Start the collectContinuously thread. It will request collection at 1 kHz. (JSC::Heap::prepareForAllocation): Deleted. (JSC::Heap::preventCollection): Prevent any new concurrent GCs from being initiated. (JSC::Heap::allowCollection): (JSC::Heap::forEachSlotVisitor): Allows us to safely iterate slot visitors. * heap/Heap.h: * heap/HeapInlines.h: (JSC::Heap::writeBarrier): If the 'to' cell is not NewWhite then it could be AnthraciteOrBlack. 
During a full collection, objects may be AnthraciteOrBlack from a previous GC. Turns out, we don't benefit from this optimization so we can just kill it. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::buildSnapshot): This needs to use PreventCollectionScope to ensure snapshot soundness. * heap/ListableHandler.h: (JSC::ListableHandler::isOnList): Useful helper. * heap/LockDuringMarking.h: (JSC::lockDuringMarking): It's a locker that only locks while we're marking. * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::addBlock): Hold the bitvector lock while resizing. * heap/MarkedBlock.cpp: Hold the bitvector lock while accessing the bitvectors while the mutator is running. * heap/MarkedSpace.cpp: (JSC::MarkedSpace::prepareForConservativeScan): We used to do this in prepareForMarking, but we need to do it before each conservative scan not just before marking. (JSC::MarkedSpace::prepareForMarking): Remove the logic moved to prepareForConservativeScan. * heap/MarkedSpace.h: * heap/PreventCollectionScope.h: Added. * heap/SlotVisitor.cpp: Refactored drainFromShared so that we can write a similar function called drainInParallelPassively. (JSC::SlotVisitor::updateMutatorIsStopped): Update whether we can use "fast" scanning. (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drain): This now uses the rightToRun lock to allow the main GC thread to safepoint the workers. (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): This runs marking with one fewer threads than normal. It's useful for when we have resumed the mutator, since then the mutator has a better chance of getting on a core. (JSC::SlotVisitor::addWeakReferenceHarvester): (JSC::SlotVisitor::addUnconditionalFinalizer): (JSC::SlotVisitor::harvestWeakReferences): Deleted. (JSC::SlotVisitor::finalizeUnconditionalFinalizers): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: Outline stuff. (JSC::SlotVisitor::addWeakReferenceHarvester): Deleted. (JSC::SlotVisitor::addUnconditionalFinalizer): Deleted. * runtime/InferredType.cpp: This needed thread safety. (JSC::InferredType::visitChildren): This needs to keep its structure finalizer alive until it runs. (JSC::InferredType::set): (JSC::InferredType::InferredStructureFinalizer::finalizeUnconditionally): * runtime/InferredType.h: * runtime/InferredValue.cpp: This needed thread safety. (JSC::InferredValue::visitChildren): (JSC::InferredValue::ValueCleanup::finalizeUnconditionally): * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): Update to use new butterfly API. (JSC::JSArray::unshiftCountWithArrayStorage): Update to use new butterfly API. * runtime/JSArrayBufferView.cpp: (JSC::JSArrayBufferView::visitChildren): Thread safety. * runtime/JSCell.h: (JSC::JSCell::setStructureIDDirectly): This is used for nuking the structure. (JSC::JSCell::InternalLocker::InternalLocker): Deleted. The cell is now the lock. (JSC::JSCell::InternalLocker::~InternalLocker): Deleted. The cell is now the lock. * runtime/JSCellInlines.h: (JSC::JSCell::structure): Clean this up. (JSC::JSCell::lock): The cell is now the lock. (JSC::JSCell::tryLock): (JSC::JSCell::unlock): (JSC::JSCell::isLocked): (JSC::JSCell::lockInternalLock): Deleted. (JSC::JSCell::unlockInternalLock): Deleted. * runtime/JSFunction.cpp: (JSC::JSFunction::visitChildren): Thread safety. * runtime/JSGenericTypedArrayViewInlines.h: (JSC::JSGenericTypedArrayView<Adaptor>::visitChildren): Thread safety. (JSC::JSGenericTypedArrayView<Adaptor>::slowDownAndWasteMemory): Thread safety. * runtime/JSObject.cpp: (JSC::JSObject::markAuxiliaryAndVisitOutOfLineProperties): Factor out this "easy" step of butterfly visiting. (JSC::JSObject::visitButterfly): Make this achieve 100% precision about structure-butterfly relationships. 
This relies on the mutator "nuking" the structure prior to "locked" structure-butterfly transitions. (JSC::JSObject::visitChildren): Use the new, nicer API. (JSC::JSFinalObject::visitChildren): Use the new, nicer API. (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): Use the new butterfly API. (JSC::JSObject::createInitialUndecided): Use the new butterfly API. (JSC::JSObject::createInitialInt32): Use the new butterfly API. (JSC::JSObject::createInitialDouble): Use the new butterfly API. (JSC::JSObject::createInitialContiguous): Use the new butterfly API. (JSC::JSObject::createArrayStorage): Use the new butterfly API. (JSC::JSObject::convertUndecidedToContiguous): Use the new butterfly API. (JSC::JSObject::convertUndecidedToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertInt32ToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertDoubleToContiguous): Use the new butterfly API. (JSC::JSObject::convertDoubleToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertContiguousToArrayStorage): Use the new butterfly API. (JSC::JSObject::increaseVectorLength): Use the new butterfly API. (JSC::JSObject::shiftButterflyAfterFlattening): Use the new butterfly API. * runtime/JSObject.h: (JSC::JSObject::setButterfly): This now does all of the fences. Only use this when you are not also transitioning the structure or the structure's lastOffset. (JSC::JSObject::nukeStructureAndSetButterfly): Use this when doing locked structure-butterfly transitions. * runtime/JSObjectInlines.h: (JSC::JSObject::putDirectWithoutTransition): Use the newly factored out API. (JSC::JSObject::prepareToPutDirectWithoutTransition): Factor this out! (JSC::JSObject::putDirectInternal): Use the newly factored out API. * runtime/JSPropertyNameEnumerator.cpp: (JSC::JSPropertyNameEnumerator::finishCreation): Locks! (JSC::JSPropertyNameEnumerator::visitChildren): Locks! 
* runtime/JSSegmentedVariableObject.cpp: (JSC::JSSegmentedVariableObject::visitChildren): Locks! * runtime/JSString.cpp: (JSC::JSString::visitChildren): Thread safety. * runtime/ModuleProgramExecutable.cpp: (JSC::ModuleProgramExecutable::visitChildren): Thread safety. * runtime/Options.cpp: For now we disable concurrent GC on not-X86_64. (JSC::recomputeDependentOptions): * runtime/Options.h: Change the default max GC parallelism to 8. I don't know why it was still 7. * runtime/SamplingProfiler.cpp: (JSC::SamplingProfiler::stackTracesAsJSON): This needs to defer GC before grabbing its lock. * runtime/SparseArrayValueMap.cpp: This needed thread safety. (JSC::SparseArrayValueMap::add): (JSC::SparseArrayValueMap::remove): (JSC::SparseArrayValueMap::visitChildren): * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: This had a race between addNewPropertyTransition and visitChildren. (JSC::Structure::Structure): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::add): Help out with nuking support - the m_offset needs to play along. (JSC::Structure::visitChildren): * runtime/Structure.h: Make some useful things public - like the notion of a lastOffset. * runtime/StructureChain.cpp: (JSC::StructureChain::visitChildren): Thread safety! * runtime/StructureChain.h: Thread safety! * runtime/StructureIDTable.cpp: (JSC::StructureIDTable::allocateID): Ensure that we don't get nuked IDs. * runtime/StructureIDTable.h: Add the notion of a nuked ID! It's a bit that the runtime never sees except during specific shady actions like locked structure-butterfly transitions. "Nuking" tells the GC to steer clear and rescan once we fire the barrier. (JSC::nukedStructureIDBit): (JSC::nuke): (JSC::isNuked): (JSC::decontaminate): * runtime/StructureInlines.h: (JSC::Structure::hasIndexingHeader): Better API. (JSC::Structure::add): * runtime/VM.cpp: Better GC interaction. 
(JSC::VM::ensureWatchdog): (JSC::VM::deleteAllLinkedCode): (JSC::VM::deleteAllCode): * runtime/VM.h: (JSC::VM::getStructure): Why wasn't this always an API! * runtime/WebAssemblyExecutable.cpp: (JSC::WebAssemblyExecutable::visitChildren): Thread safety. Source/WebCore: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Made WebCore down with concurrent marking by adding some locking and adapting to some new API. This has new test modes in run-jsc-stress-tests. Also, the way that LayoutTests run is already a fantastic GC test. * ForwardingHeaders/heap/DeleteAllCodeEffort.h: Added. * ForwardingHeaders/heap/LockDuringMarking.h: Added. * bindings/js/GCController.cpp: (WebCore::GCController::deleteAllCode): (WebCore::GCController::deleteAllLinkedCode): * bindings/js/GCController.h: * bindings/js/JSDOMBinding.cpp: (WebCore::getCachedDOMStructure): (WebCore::cacheDOMStructure): * bindings/js/JSDOMGlobalObject.cpp: (WebCore::JSDOMGlobalObject::addBuiltinGlobals): (WebCore::JSDOMGlobalObject::visitChildren): * bindings/js/JSDOMGlobalObject.h: (WebCore::getDOMConstructor): * bindings/js/JSDOMPromise.cpp: (WebCore::DeferredPromise::DeferredPromise): (WebCore::DeferredPromise::clear): * bindings/js/JSXPathResultCustom.cpp: (WebCore::JSXPathResult::visitAdditionalChildren): * dom/EventListenerMap.cpp: (WebCore::EventListenerMap::clear): (WebCore::EventListenerMap::replace): (WebCore::EventListenerMap::add): (WebCore::EventListenerMap::remove): (WebCore::EventListenerMap::find): (WebCore::EventListenerMap::removeFirstEventListenerCreatedFromMarkup): (WebCore::EventListenerMap::copyEventListenersNotCreatedFromMarkupToTarget): (WebCore::EventListenerIterator::EventListenerIterator): * dom/EventListenerMap.h: (WebCore::EventListenerMap::lock): * dom/EventTarget.cpp: (WebCore::EventTarget::visitJSEventListeners): * dom/EventTarget.h: (WebCore::EventTarget::visitJSEventListeners): Deleted. 
* dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): * dom/Node.h: * page/MemoryRelease.cpp: (WebCore::releaseCriticalMemory): * page/cocoa/MemoryReleaseCocoa.mm: (WebCore::jettisonExpensiveObjectsOnTopLevelNavigation): (WebCore::registerMemoryReleaseNotifyCallbacks): Source/WTF: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Adds the ability to say: auto locker = holdLock(any type of lock) Instead of having to say: Locker<LockType> locker(locks of type LockType) I think that we should use "auto locker = holdLock(lock)" as the default way that we acquire locks unless we need to use a special locker type. This also adds the ability to safepoint a lock. Safepointing a lock is basically a super fast way of unlocking it fairly and then immediately relocking it - i.e. letting anyone who is waiting run, without losing steam if there is no one waiting. * wtf/Lock.cpp: (WTF::LockBase::safepointSlow): * wtf/Lock.h: (WTF::LockBase::safepoint): * wtf/LockAlgorithm.h: (WTF::LockAlgorithm::safepointFast): (WTF::LockAlgorithm::safepoint): (WTF::LockAlgorithm::safepointSlow): * wtf/Locker.h: (WTF::AbstractLocker::AbstractLocker): (WTF::Locker::tryLock): (WTF::Locker::operator bool): (WTF::Locker::Locker): (WTF::Locker::operator=): (WTF::holdLock): (WTF::tryHoldLock): Tools: Concurrent GC should be stable enough to land enabled https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Add a new mode that runs GC continuously. Also made eager modes run GC continuously. It's clear that this works just fine in release, but I'm still trying to figure out if it's safe for debug. It might be too slow for debug. 
* Scripts/run-jsc-stress-tests: Canonical link: https://commits.webkit.org/183229@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@209570 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-12-08 22:14:50 +00:00
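The "auto locker = holdLock(lock)" idiom from the WTF part of this commit message boils down to a function template that deduces the lock type and returns a movable RAII locker. Below is a minimal sketch under that assumption. The Locker here is a simplified stand-in and not WTF's Locker.h, and CountingLock is a hypothetical lock type that exists only to make the behaviour observable.

```cpp
#include <cassert>
#include <utility>

// Simplified RAII locker: acquires in the constructor, releases in the
// destructor, and can be moved so that it can be returned by value.
template<typename LockType>
class Locker {
public:
    explicit Locker(LockType& lock)
        : m_lock(&lock)
    {
        m_lock->lock();
    }

    Locker(Locker&& other)
        : m_lock(std::exchange(other.m_lock, nullptr))
    {
    }

    Locker(const Locker&) = delete;
    Locker& operator=(const Locker&) = delete;

    ~Locker()
    {
        if (m_lock)
            m_lock->unlock();
    }

private:
    LockType* m_lock;
};

// The lock type is deduced from the argument, so callers never spell it out.
template<typename LockType>
Locker<LockType> holdLock(LockType& lock)
{
    return Locker<LockType>(lock);
}

// Hypothetical lock used only to observe what the locker does.
struct CountingLock {
    bool held = false;
    int lockCount = 0;
    int unlockCount = 0;

    void lock()
    {
        assert(!held);
        held = true;
        ++lockCount;
    }

    void unlock()
    {
        assert(held);
        held = false;
        ++unlockCount;
    }
};
```

Any type with lock() and unlock() members works with this sketch, which is the point of the idiom: the call site reads the same regardless of which lock type is in use.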
void safepoint()
{
if (UNLIKELY(!DefaultLockAlgorithm::safepointFast(m_byte)))
safepointSlow();
}
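The commit message for safepoint() describes it as a fast way of unlocking fairly and then immediately relocking, with nothing to do when no one is waiting. A minimal sketch of that contract follows. The hasWaiters() query and the RecordingLock type are assumptions made for illustration; the real fast path is safepointFast() operating on the lock byte, and its internals are not shown in this file.

```cpp
#include <cassert>
#include <string>

// Hypothetical lock that records the slow-path calls made on it.
struct RecordingLock {
    bool held = true;      // the caller of safepoint() already holds the lock
    bool waiters = false;  // stands in for "some thread is parked on this lock"
    std::string log;

    bool hasWaiters() const { return waiters; }

    void unlockFairly()
    {
        assert(held);
        held = false;
        log += "unlockFairly;";
    }

    void lock()
    {
        assert(!held);
        held = true;
        log += "lock;";
    }
};

template<typename LockType>
void safepoint(LockType& lock)
{
    // Fast path: nobody is waiting, so keep the lock and keep going.
    if (!lock.hasWaiters())
        return;
    // Slow path: hand the lock to a waiter, then get back in line.
    lock.unlockFairly();
    lock.lock();
}
```

A long-running holder, such as a marking thread, could call this periodically so that waiters get a turn without the holder paying for an unlock and relock when the lock is uncontended.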
WTF::Lock should be fair eventually https://bugs.webkit.org/show_bug.cgi?id=159384 Reviewed by Geoffrey Garen. Source/WTF: In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of locks makes them fast. That post presented lock fairness as a trade-off between two extremes: - Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a thread on the queue. If there was a thread on the queue, the lock is released and that thread is made runnable. That thread may then grab the lock, or some other thread may grab the lock first (it may barge). Usually, the barging thread is the thread that released the lock in the first place. This maximizes throughput but hurts fairness. There is no good theoretical bound on how unfair the lock may become, but empirical data suggests that it's fair enough for the cases we previously measured. - FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock() if there is a thread waiting. If there is a thread waiting, unlock() will make that thread runnable and inform it that it now holds the lock. This ensures perfect round-robin fairness and allows us to reason theoretically about how long it may take for a thread to grab the lock. For example, if we know that only N threads are running and each one may contend on a critical section, and each one may hold the lock for at most S seconds, then the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly in most cases. This is because for the common case of short critical sections, they force a context switch after each critical section if the lock is contended. This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging. Thanks to this new algorithm, you can now have both of these things at the same time. This change makes WTF::Lock eventually fair. 
2016-07-18 18:32:52 +00:00
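The stochastic timeToBeFair timeout that the "WTF::Lock should be fair eventually" commit adds to ParkingLot can be sketched as a small deadline tracker. This is a sketch under stated assumptions, not the ParkingLot code: FairnessTimer and shouldBeFair() are hypothetical names, and the per-bucket storage and queue lock are left out. Each time the deadline has passed, it reports that it is time to be fair and picks a new deadline uniformly at random between 0ms and 1ms in the future, which is what avoids resonance with a thread that locks at a fixed rate.

```cpp
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical stand-in for the per-bucket fairness state.
class FairnessTimer {
public:
    using Clock = std::chrono::steady_clock;

    explicit FairnessTimer(double maxTimeoutMs = 1.0, unsigned seed = 0)
        : m_maxTimeoutMs(maxTimeoutMs)
        , m_random(seed)
    {
        assert(maxTimeoutMs > 0);
    }

    // Call when a dequeue actually dequeued a thread. Returns true if the
    // deadline has passed, and in that case re-arms it at a random offset.
    bool shouldBeFair(Clock::time_point now)
    {
        if (now <= m_nextFairTime)
            return false;
        std::uniform_real_distribution<double> dist(0.0, m_maxTimeoutMs);
        std::chrono::duration<double, std::milli> delay(dist(m_random));
        m_nextFairTime = now + std::chrono::duration_cast<Clock::duration>(delay);
        return true;
    }

private:
    double m_maxTimeoutMs;
    std::mt19937 m_random;
    Clock::time_point m_nextFairTime { };
};
```

Taking the current time as a parameter keeps the sketch deterministic to test; a caller would normally pass FairnessTimer::Clock::now().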
Lightweight locks should be adaptive https://bugs.webkit.org/show_bug.cgi?id=147545 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * dfg/DFGCommon.cpp: (JSC::DFG::startCrashing): * heap/CopiedBlock.h: (JSC::CopiedBlock::workListLock): * heap/CopiedBlockInlines.h: (JSC::CopiedBlock::shouldReportLiveBytes): (JSC::CopiedBlock::reportLiveBytes): * heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock): * heap/CopiedSpace.h: (JSC::CopiedSpace::CopiedGeneration::CopiedGeneration): * heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleEvacuatedBlock): * heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::didStartCopying): * heap/GCThreadSharedData.h: (JSC::GCThreadSharedData::getNextBlocksToCopy): * heap/ListableHandler.h: (JSC::ListableHandler::List::addThreadSafe): (JSC::ListableHandler::List::addNotThreadSafe): * heap/MachineStackMarker.cpp: (JSC::MachineThreads::tryCopyOtherThreadStacks): * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::copyLater): * parser/SourceProvider.cpp: (JSC::SourceProvider::~SourceProvider): (JSC::SourceProvider::getID): * profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::addDatabaseToAtExit): (JSC::Profiler::Database::removeDatabaseFromAtExit): (JSC::Profiler::Database::removeFirstAtExitDatabase): * runtime/TypeProfilerLog.h: Source/WebCore: * bindings/objc/WebScriptObject.mm: (WebCore::getJSWrapper): (WebCore::addJSWrapper): (WebCore::removeJSWrapper): (WebCore::removeJSWrapperIfRetainCountOne): * platform/audio/mac/CARingBuffer.cpp: (WebCore::CARingBuffer::setCurrentFrameBounds): (WebCore::CARingBuffer::getCurrentFrameBounds): * platform/audio/mac/CARingBuffer.h: * platform/ios/wak/WAKWindow.mm: (-[WAKWindow setExposedScrollViewRect:]): (-[WAKWindow exposedScrollViewRect]): Source/WebKit2: * WebProcess/WebPage/EventDispatcher.cpp: (WebKit::EventDispatcher::clearQueuedTouchEventsForPage): (WebKit::EventDispatcher::getQueuedTouchEventsForPage): (WebKit::EventDispatcher::touchEvent): 
(WebKit::EventDispatcher::dispatchTouchEvents): * WebProcess/WebPage/EventDispatcher.h: * WebProcess/WebPage/ViewUpdateDispatcher.cpp: (WebKit::ViewUpdateDispatcher::visibleContentRectUpdate): (WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate): * WebProcess/WebPage/ViewUpdateDispatcher.h: Source/WTF: A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big. But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock. The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. 
The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock. This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time. On a locking microbenchmark, this new Lock exhibits the following performance characteristics: - Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex. - Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex. - CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock. - Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock. This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time. This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock. 
But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665. Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head. [1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks: Added. * benchmarks/LockSpeedTest.cpp: Added. (main): * wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong): * wtf/CMakeLists.txt: * wtf/Lock.cpp: Added. (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): * wtf/Lock.h: Added. (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocatorHandle::shrink): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): (WTF::MetaAllocator::addFreshFreeSpace): (WTF::MetaAllocator::debugFreeSpaceSize): * wtf/MetaAllocator.h: * wtf/SpinLock.h: * wtf/ThreadingPthreads.cpp: * wtf/ThreadingWin.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicStringImpl.cpp: (WTF::AtomicStringTableLocker::AtomicStringTableLocker): Tools: * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Lock.cpp: Added. (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/165908@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-07 22:38:59 +00:00
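The commit message above describes three levels of adaptation: an inlined CAS fast path, a short burst of spinning, and finally putting the thread to sleep. The following is a minimal sketch of that shape, not the WTF::Lock implementation. It swaps the queue of statically allocated thread nodes for a single process-wide `std::mutex`/`std::condition_variable` pair used only for parking, and the names `AdaptiveLockSketch`, `runCounterDemo`, and the spin limit of 40 are hypothetical choices made for illustration.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of an adaptive lock; not the real WTF::Lock.
class AdaptiveLockSketch {
public:
    void lock()
    {
        // Fast path: one CAS acquires a lock that is not held.
        // A weak CAS may fail spuriously; the slow path tolerates that.
        std::uint8_t expected = NotHeld;
        if (m_state.compare_exchange_weak(expected, Held, std::memory_order_acquire))
            return;
        lockSlow();
    }

    void unlock()
    {
        // Only pay for a wake-up if some thread announced that it parked.
        if (m_state.exchange(NotHeld, std::memory_order_release) == HeldWithWaiters) {
            std::lock_guard<std::mutex> guard(parkingMutex());
            parkingCondition().notify_all();
        }
    }

    bool isHeld() const { return m_state.load(std::memory_order_acquire) != NotHeld; }

private:
    enum State : std::uint8_t { NotHeld = 0, Held = 1, HeldWithWaiters = 2 };

    void lockSlow()
    {
        // Middle path: spin briefly, which is cheap under microcontention.
        const int spinLimit = 40; // assumed value, chosen for illustration
        for (int i = 0; i < spinLimit; ++i) {
            std::uint8_t expected = NotHeld;
            if (m_state.compare_exchange_weak(expected, Held, std::memory_order_acquire))
                return;
            std::this_thread::yield();
        }

        // Last resort: sleep until an unlock() notifies. Marking the state
        // as HeldWithWaiters while holding the parking mutex means the
        // unlocking thread cannot notify before this thread is waiting.
        std::unique_lock<std::mutex> guard(parkingMutex());
        while (m_state.exchange(HeldWithWaiters, std::memory_order_acquire) != NotHeld)
            parkingCondition().wait(guard);
    }

    static std::mutex& parkingMutex()
    {
        static std::mutex mutex;
        return mutex;
    }

    static std::condition_variable& parkingCondition()
    {
        static std::condition_variable condition;
        return condition;
    }

    std::atomic<std::uint8_t> m_state { 0 };
};

// Usage example: several threads increment a shared counter under the lock.
inline long runCounterDemo(int threadCount, int iterationsPerThread)
{
    AdaptiveLockSketch lock;
    long counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterationsPerThread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

A thread that acquires the lock after parking leaves the state at `HeldWithWaiters`, so its own `unlock()` notifies even when nobody is waiting. That is conservative but keeps the sketch free of lost wake-ups.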
bool isHeld() const
{
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen. Source/JavaScriptCore: This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again. It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff: - The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors. - The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of. - All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. 
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
return DefaultLockAlgorithm::isLocked(m_byte);
}
bool isLocked() const
{
return isHeld();
}
REGRESSION (r219895-219897): Number of leaks on Open Source went from 9240 to 235983 and is now at 302372 https://bugs.webkit.org/show_bug.cgi?id=175083 Reviewed by Oliver Hunt. Source/JavaScriptCore: This fixes the leak by making MarkedBlock::specializedSweep call destructors when the block is empty, even if we are using the pop path. Also, this fixes HeapCellInlines.h to no longer include MarkedBlockInlines.h. That's pretty important, since MarkedBlockInlines.h is the GC's internal guts - we don't want to have to recompile the world just because we changed it. Finally, this adds a new testing SPI for waiting for all VMs to finish destructing. This makes it easier to debug leaks. * bytecode/AccessCase.cpp: * bytecode/PolymorphicAccess.cpp: * heap/HeapCell.cpp: (JSC::HeapCell::isLive): * heap/HeapCellInlines.h: (JSC::HeapCell::isLive): Deleted. * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateWithoutCollecting): (JSC::MarkedAllocator::endMarking): * heap/MarkedBlockInlines.h: (JSC::MarkedBlock::Handle::specializedSweep): * jit/AssemblyHelpers.cpp: * jit/Repatch.cpp: * runtime/TestRunnerUtils.h: * runtime/VM.cpp: (JSC::waitForVMDestruction): (JSC::VM::~VM): Source/WTF: Adds a classic ReadWriteLock class. I wrote my own because I can never remember if the pthread one is guaranteed to bias in favor of writers or not. * WTF.xcodeproj/project.pbxproj: * wtf/Condition.h: (WTF::ConditionBase::construct): (WTF::Condition::Condition): * wtf/Lock.h: (WTF::LockBase::construct): (WTF::Lock::Lock): * wtf/ReadWriteLock.cpp: Added. (WTF::ReadWriteLockBase::construct): (WTF::ReadWriteLockBase::readLock): (WTF::ReadWriteLockBase::readUnlock): (WTF::ReadWriteLockBase::writeLock): (WTF::ReadWriteLockBase::writeUnlock): * wtf/ReadWriteLock.h: Added. 
(WTF::ReadWriteLockBase::ReadLock::tryLock): (WTF::ReadWriteLockBase::ReadLock::lock): (WTF::ReadWriteLockBase::ReadLock::unlock): (WTF::ReadWriteLockBase::WriteLock::tryLock): (WTF::ReadWriteLockBase::WriteLock::lock): (WTF::ReadWriteLockBase::WriteLock::unlock): (WTF::ReadWriteLockBase::read): (WTF::ReadWriteLockBase::write): (WTF::ReadWriteLock::ReadWriteLock): Tools: Leaks results are super confusing if leaks runs while some VMs are destructing. This calls a new SPI to wait for VM destructions to finish before running the next test. This makes it easier to understand leaks results from workers tests, and leads to fewer reported leaks. * DumpRenderTree/mac/DumpRenderTree.mm: (runTest): Canonical link: https://commits.webkit.org/191978@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@220322 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-08-06 04:43:37 +00:00
private:
WTF::Lock should not suffer from the thundering herd https://bugs.webkit.org/show_bug.cgi?id=147947 Reviewed by Geoffrey Garen. Source/WTF: This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with doing this is that it's not obvious after calling unparkOne() if there are any other threads that are still parked on the lock's queue. If we assume that there are and leave the hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that if there aren't actually any threads parked. On the other hand, if we assume that there aren't any threads parked and clear the hasParkedBit, then if there actually were some threads parked, then they may never be awoken since future calls to unlock() won't take the slow path and so won't call unparkOne(). In other words, we need a way to be very precise about when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case that we clear the bit just as some thread gets parked on the queue. A similar problem arises in futexes, and one of the solutions is to have a thread that acquires a lock after parking sets the hasParkedBit. This is what Rusty Russell's usersem does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked thread runs, then that barging thread will not know that there are threads parked. This could increase the severity of barging. Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security issues and so we can expose callbacks while ParkingLot is holding its internal locks. This change does exactly that for unparkOne(). The new variant of unparkOne() will call a user function while the queue from which we are unparking is locked. The callback is told basic stats about the queue: did we unpark a thread this time, and could there be more threads to unpark in the future. 
The callback runs while it's impossible for the queue state to change, since the ParkingLot's internal locks for the queue are held. This means that Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock inside the callback from unparkOne(). This takes care of the thundering herd problem while also reducing the greed that arises from barging threads. This required some careful reworking of the ParkingLot algorithm. The first thing I noticed was that the ThreadData::shouldPark flag was useless, since it's set exactly when ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create both hashtables and buckets, since the "callback is called while queue is locked" invariant requires that we didn't exit early due to the hashtable or bucket not being present. Note that all of this is done in such a way that the old unparkOne() and unparkAll() don't have to create any buckets, though they now may create the hashtable. We don't care as much about the hashtable being created by unpark since it's just such an unlikely scenario and it would only happen once. This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that benchmark. * benchmarks/LockSpeedTest.cpp: * wtf/Lock.cpp: (WTF::LockBase::unlockSlow): * wtf/Lock.h: (WTF::LockBase::isLocked): (WTF::LockBase::isFullyReset): * wtf/ParkingLot.cpp: (WTF::ParkingLot::parkConditionally): (WTF::ParkingLot::unparkOne): (WTF::ParkingLot::unparkAll): * wtf/ParkingLot.h: * wtf/WordLock.h: (WTF::WordLock::isLocked): (WTF::WordLock::isFullyReset): Tools: Add testing that checks that locks return to a pristine state after contention is over. 
* TestWebKitAPI/Tests/WTF/Lock.cpp: (TestWebKitAPI::LockInspector::isFullyReset): (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/166072@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
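The commit messages above describe three levels of adaptation (an inlined CAS fast path, a short burst of spinning, then parking) and a rule that the hasParkedBit is only cleared or kept while the wait queue is locked. The following is a minimal sketch of that shape, not the WebKit implementation: `AdaptiveLockSketch`, `spinLimit`, `m_parkedCount`, and `runContendedIncrements` are hypothetical names, and a `std::mutex` plus `std::condition_variable` stand in for ParkingLot's per-address queue.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for WTF::Lock; an illustration only.
class AdaptiveLockSketch {
public:
    void lock()
    {
        // Fast path: a single CAS when the lock is free and nobody is parked.
        uint8_t expected = 0;
        if (m_state.compare_exchange_weak(expected, isHeldBit, std::memory_order_acquire, std::memory_order_relaxed))
            return;
        lockSlow();
    }

    void unlock()
    {
        // Fast path: a single CAS when no thread is parked. A weak CAS may
        // fail spuriously, so unlockSlow() must tolerate an empty queue.
        uint8_t expected = isHeldBit;
        if (m_state.compare_exchange_weak(expected, 0, std::memory_order_release, std::memory_order_relaxed))
            return;
        unlockSlow();
    }

private:
    static constexpr uint8_t isHeldBit = 1;
    static constexpr uint8_t hasParkedBit = 2;
    static constexpr unsigned spinLimit = 40; // assumed value, not from the source

    void lockSlow()
    {
        unsigned spins = 0;
        for (;;) {
            uint8_t state = m_state.load(std::memory_order_relaxed);
            if (!(state & isHeldBit)) {
                // Lock looks free: barge in, preserving hasParkedBit.
                uint8_t held = static_cast<uint8_t>(state | isHeldBit);
                if (m_state.compare_exchange_weak(state, held, std::memory_order_acquire, std::memory_order_relaxed))
                    return;
                continue;
            }
            if (spins < spinLimit) {
                ++spins;
                std::this_thread::yield();
                continue;
            }
            // Park: publish hasParkedBit while the queue is locked, then sleep.
            std::unique_lock<std::mutex> queueLock(m_queueMutex);
            state = m_state.load(std::memory_order_relaxed);
            if (!(state & isHeldBit))
                continue; // Released in the meantime; retry the acquire.
            uint8_t parked = static_cast<uint8_t>(state | hasParkedBit);
            if (!(state & hasParkedBit) && !m_state.compare_exchange_strong(state, parked, std::memory_order_relaxed))
                continue; // State changed under us; re-examine it.
            ++m_parkedCount;
            m_queueCondition.wait(queueLock);
            --m_parkedCount;
        }
    }

    void unlockSlow()
    {
        std::lock_guard<std::mutex> queueLock(m_queueMutex);
        assert(m_state.load(std::memory_order_relaxed) & isHeldBit);
        // The queue is locked, so m_parkedCount cannot change here. Keep
        // hasParkedBit only if a thread will remain parked after waking one.
        uint8_t newState = m_parkedCount > 1 ? hasParkedBit : 0;
        m_state.store(newState, std::memory_order_release);
        m_queueCondition.notify_one(); // Wake one thread, not the whole herd.
    }

    std::atomic<uint8_t> m_state { 0 };
    std::mutex m_queueMutex;
    std::condition_variable m_queueCondition;
    unsigned m_parkedCount { 0 };
};

// Usage: several threads increment a shared counter under the lock.
inline long runContendedIncrements(unsigned threadCount, long iterations)
{
    AdaptiveLockSketch lock;
    long counter = 0;
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < threadCount; ++i) {
        threads.emplace_back([&] {
            for (long j = 0; j < iterations; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

In this sketch a stale hasParkedBit (for example after a spurious wakeup) only costs one extra trip through `unlockSlow()`, which then finds the queue empty and clears the bit.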
friend struct TestWebKitAPI::LockInspector;
Use constexpr instead of const in symbol definitions that are obviously constexpr. https://bugs.webkit.org/show_bug.cgi?id=201879 Rubber-stamped by Joseph Pecoraro. Source/bmalloc: * bmalloc/AvailableMemory.cpp: * bmalloc/IsoTLS.h: * bmalloc/Map.h: * bmalloc/Mutex.cpp: (bmalloc::Mutex::lockSlowCase): * bmalloc/PerThread.h: * bmalloc/Vector.h: * bmalloc/Zone.h: Source/JavaScriptCore: const may require external storage (at the compiler's whim) though these currently do not. constexpr makes it clear that the value is a literal constant that can be inlined. In most cases in the code, when we say static const, we actually mean static constexpr. I'm changing the code to reflect this. * API/JSAPIValueWrapper.h: * API/JSCallbackConstructor.h: * API/JSCallbackObject.h: * API/JSContextRef.cpp: * API/JSWrapperMap.mm: * API/tests/CompareAndSwapTest.cpp: * API/tests/TypedArrayCTest.cpp: * API/tests/testapi.mm: (testObjectiveCAPIMain): * KeywordLookupGenerator.py: (Trie.printAsC): * assembler/ARMv7Assembler.h: * assembler/AssemblerBuffer.h: * assembler/AssemblerCommon.h: * assembler/MacroAssembler.h: * assembler/MacroAssemblerARM64.h: * assembler/MacroAssemblerARM64E.h: * assembler/MacroAssemblerARMv7.h: * assembler/MacroAssemblerCodeRef.h: * assembler/MacroAssemblerMIPS.h: * assembler/MacroAssemblerX86.h: * assembler/MacroAssemblerX86Common.h: (JSC::MacroAssemblerX86Common::absDouble): (JSC::MacroAssemblerX86Common::negateDouble): * assembler/MacroAssemblerX86_64.h: * assembler/X86Assembler.h: * b3/B3Bank.h: * b3/B3CheckSpecial.h: * b3/B3DuplicateTails.cpp: * b3/B3EliminateCommonSubexpressions.cpp: * b3/B3FixSSA.cpp: * b3/B3FoldPathConstants.cpp: * b3/B3InferSwitches.cpp: * b3/B3Kind.h: * b3/B3LowerToAir.cpp: * b3/B3NativeTraits.h: * b3/B3ReduceDoubleToFloat.cpp: * b3/B3ReduceLoopStrength.cpp: * b3/B3ReduceStrength.cpp: * b3/B3ValueKey.h: * b3/air/AirAllocateRegistersByGraphColoring.cpp: * b3/air/AirAllocateStackByGraphColoring.cpp: * b3/air/AirArg.h: * 
b3/air/AirCCallSpecial.h: * b3/air/AirEmitShuffle.cpp: * b3/air/AirFixObviousSpills.cpp: * b3/air/AirFormTable.h: * b3/air/AirLowerAfterRegAlloc.cpp: * b3/air/AirPrintSpecial.h: * b3/air/AirStackAllocation.cpp: * b3/air/AirTmp.h: * b3/testb3_6.cpp: (testInterpreter): * bytecode/AccessCase.cpp: * bytecode/CallLinkStatus.cpp: * bytecode/CallVariant.h: * bytecode/CodeBlock.h: * bytecode/CodeOrigin.h: * bytecode/DFGExitProfile.h: * bytecode/DirectEvalCodeCache.h: * bytecode/ExecutableToCodeBlockEdge.h: * bytecode/GetterSetterAccessCase.cpp: * bytecode/LazyOperandValueProfile.h: * bytecode/ObjectPropertyCondition.h: * bytecode/ObjectPropertyConditionSet.cpp: * bytecode/PolymorphicAccess.cpp: * bytecode/PropertyCondition.h: * bytecode/SpeculatedType.h: * bytecode/StructureStubInfo.cpp: * bytecode/UnlinkedCodeBlock.cpp: (JSC::UnlinkedCodeBlock::typeProfilerExpressionInfoForBytecodeOffset): * bytecode/UnlinkedCodeBlock.h: * bytecode/UnlinkedEvalCodeBlock.h: * bytecode/UnlinkedFunctionCodeBlock.h: * bytecode/UnlinkedFunctionExecutable.h: * bytecode/UnlinkedModuleProgramCodeBlock.h: * bytecode/UnlinkedProgramCodeBlock.h: * bytecode/ValueProfile.h: * bytecode/VirtualRegister.h: * bytecode/Watchpoint.h: * bytecompiler/BytecodeGenerator.h: * bytecompiler/Label.h: * bytecompiler/NodesCodegen.cpp: (JSC::ThisNode::emitBytecode): * bytecompiler/RegisterID.h: * debugger/Breakpoint.h: * debugger/DebuggerParseData.cpp: * debugger/DebuggerPrimitives.h: * debugger/DebuggerScope.h: * dfg/DFGAbstractHeap.h: * dfg/DFGAbstractValue.h: * dfg/DFGArgumentsEliminationPhase.cpp: * dfg/DFGByteCodeParser.cpp: * dfg/DFGCSEPhase.cpp: * dfg/DFGCommon.h: * dfg/DFGCompilationKey.h: * dfg/DFGDesiredGlobalProperty.h: * dfg/DFGEdgeDominates.h: * dfg/DFGEpoch.h: * dfg/DFGForAllKills.h: (JSC::DFG::forAllKilledNodesAtNodeIndex): * dfg/DFGGraph.cpp: (JSC::DFG::Graph::isLiveInBytecode): * dfg/DFGHeapLocation.h: * dfg/DFGInPlaceAbstractState.cpp: * dfg/DFGIntegerCheckCombiningPhase.cpp: * 
dfg/DFGIntegerRangeOptimizationPhase.cpp: * dfg/DFGInvalidationPointInjectionPhase.cpp: * dfg/DFGLICMPhase.cpp: * dfg/DFGLazyNode.h: * dfg/DFGMinifiedID.h: * dfg/DFGMovHintRemovalPhase.cpp: * dfg/DFGNodeFlowProjection.h: * dfg/DFGNodeType.h: * dfg/DFGObjectAllocationSinkingPhase.cpp: * dfg/DFGPhantomInsertionPhase.cpp: * dfg/DFGPromotedHeapLocation.h: * dfg/DFGPropertyTypeKey.h: * dfg/DFGPureValue.h: * dfg/DFGPutStackSinkingPhase.cpp: * dfg/DFGRegisterBank.h: * dfg/DFGSSAConversionPhase.cpp: * dfg/DFGSSALoweringPhase.cpp: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::compileDoubleRep): (JSC::DFG::compileClampDoubleToByte): (JSC::DFG::SpeculativeJIT::compileArithRounding): (JSC::DFG::compileArithPowIntegerFastPath): (JSC::DFG::SpeculativeJIT::compileArithPow): (JSC::DFG::SpeculativeJIT::emitBinarySwitchStringRecurse): * dfg/DFGStackLayoutPhase.cpp: * dfg/DFGStoreBarrierInsertionPhase.cpp: * dfg/DFGStrengthReductionPhase.cpp: * dfg/DFGStructureAbstractValue.h: * dfg/DFGVarargsForwardingPhase.cpp: * dfg/DFGVariableEventStream.cpp: (JSC::DFG::VariableEventStream::reconstruct const): * dfg/DFGWatchpointCollectionPhase.cpp: * disassembler/ARM64/A64DOpcode.h: * ftl/FTLLocation.h: * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compileArithRandom): * ftl/FTLSlowPathCall.cpp: * ftl/FTLSlowPathCallKey.h: * heap/CellContainer.h: * heap/CellState.h: * heap/ConservativeRoots.h: * heap/GCSegmentedArray.h: * heap/HandleBlock.h: * heap/Heap.cpp: (JSC::Heap::updateAllocationLimits): * heap/Heap.h: * heap/HeapSnapshot.h: * heap/HeapUtil.h: (JSC::HeapUtil::findGCObjectPointersForMarking): * heap/IncrementalSweeper.cpp: * heap/LargeAllocation.h: * heap/MarkedBlock.cpp: * heap/Strong.h: * heap/VisitRaceKey.h: * heap/Weak.h: * heap/WeakBlock.h: * inspector/JSInjectedScriptHost.h: * inspector/JSInjectedScriptHostPrototype.h: * inspector/JSJavaScriptCallFrame.h: * inspector/JSJavaScriptCallFramePrototype.h: * inspector/agents/InspectorConsoleAgent.cpp: * 
inspector/agents/InspectorRuntimeAgent.cpp: (Inspector::InspectorRuntimeAgent::getRuntimeTypesForVariablesAtOffsets): * inspector/scripts/codegen/generate_cpp_protocol_types_header.py: (CppProtocolTypesHeaderGenerator._generate_versions): * inspector/scripts/tests/generic/expected/version.json-result: * interpreter/Interpreter.h: * interpreter/ShadowChicken.cpp: * jit/BinarySwitch.cpp: * jit/CallFrameShuffler.h: * jit/ExecutableAllocator.h: * jit/FPRInfo.h: * jit/GPRInfo.h: * jit/ICStats.h: * jit/JITThunks.h: * jit/Reg.h: * jit/RegisterSet.h: * jit/TempRegisterSet.h: * jsc.cpp: * parser/ASTBuilder.h: * parser/Nodes.h: * parser/SourceCodeKey.h: * parser/SyntaxChecker.h: * parser/VariableEnvironment.h: * profiler/ProfilerOrigin.h: * profiler/ProfilerOriginStack.h: * profiler/ProfilerUID.h: * runtime/AbstractModuleRecord.cpp: * runtime/ArrayBufferNeuteringWatchpointSet.h: * runtime/ArrayConstructor.h: * runtime/ArrayConventions.h: * runtime/ArrayIteratorPrototype.h: * runtime/ArrayPrototype.cpp: (JSC::setLength): * runtime/AsyncFromSyncIteratorPrototype.h: * runtime/AsyncGeneratorFunctionPrototype.h: * runtime/AsyncGeneratorPrototype.h: * runtime/AsyncIteratorPrototype.h: * runtime/AtomicsObject.cpp: * runtime/BigIntConstructor.h: * runtime/BigIntPrototype.h: * runtime/BooleanPrototype.h: * runtime/ClonedArguments.h: * runtime/CodeCache.h: * runtime/ControlFlowProfiler.h: * runtime/CustomGetterSetter.h: * runtime/DateConstructor.h: * runtime/DatePrototype.h: * runtime/DefinePropertyAttributes.h: * runtime/ErrorPrototype.h: * runtime/EvalExecutable.h: * runtime/Exception.h: * runtime/ExceptionHelpers.cpp: (JSC::invalidParameterInSourceAppender): (JSC::invalidParameterInstanceofSourceAppender): * runtime/ExceptionHelpers.h: * runtime/ExecutableBase.h: * runtime/FunctionExecutable.h: * runtime/FunctionRareData.h: * runtime/GeneratorPrototype.h: * runtime/GenericArguments.h: * runtime/GenericOffset.h: * runtime/GetPutInfo.h: * runtime/GetterSetter.h: * 
runtime/GlobalExecutable.h: * runtime/Identifier.h: * runtime/InspectorInstrumentationObject.h: * runtime/InternalFunction.h: * runtime/IntlCollatorConstructor.h: * runtime/IntlCollatorPrototype.h: * runtime/IntlDateTimeFormatConstructor.h: * runtime/IntlDateTimeFormatPrototype.h: * runtime/IntlNumberFormatConstructor.h: * runtime/IntlNumberFormatPrototype.h: * runtime/IntlObject.h: * runtime/IntlPluralRulesConstructor.h: * runtime/IntlPluralRulesPrototype.h: * runtime/IteratorPrototype.h: * runtime/JSArray.cpp: (JSC::JSArray::tryCreateUninitializedRestricted): * runtime/JSArray.h: * runtime/JSArrayBuffer.h: * runtime/JSArrayBufferView.h: * runtime/JSBigInt.h: * runtime/JSCJSValue.h: * runtime/JSCell.h: * runtime/JSCustomGetterSetterFunction.h: * runtime/JSDataView.h: * runtime/JSDataViewPrototype.h: * runtime/JSDestructibleObject.h: * runtime/JSFixedArray.h: * runtime/JSGenericTypedArrayView.h: * runtime/JSGlobalLexicalEnvironment.h: * runtime/JSGlobalObject.h: * runtime/JSImmutableButterfly.h: * runtime/JSInternalPromiseConstructor.h: * runtime/JSInternalPromiseDeferred.h: * runtime/JSInternalPromisePrototype.h: * runtime/JSLexicalEnvironment.h: * runtime/JSModuleEnvironment.h: * runtime/JSModuleLoader.h: * runtime/JSModuleNamespaceObject.h: * runtime/JSNonDestructibleProxy.h: * runtime/JSONObject.cpp: * runtime/JSONObject.h: * runtime/JSObject.h: * runtime/JSPromiseConstructor.h: * runtime/JSPromiseDeferred.h: * runtime/JSPromisePrototype.h: * runtime/JSPropertyNameEnumerator.h: * runtime/JSProxy.h: * runtime/JSScope.h: * runtime/JSScriptFetchParameters.h: * runtime/JSScriptFetcher.h: * runtime/JSSegmentedVariableObject.h: * runtime/JSSourceCode.h: * runtime/JSString.cpp: * runtime/JSString.h: * runtime/JSSymbolTableObject.h: * runtime/JSTemplateObjectDescriptor.h: * runtime/JSTypeInfo.h: * runtime/MapPrototype.h: * runtime/MinimumReservedZoneSize.h: * runtime/ModuleProgramExecutable.h: * runtime/NativeExecutable.h: * runtime/NativeFunction.h: * 
runtime/NativeStdFunctionCell.h: * runtime/NumberConstructor.h: * runtime/NumberPrototype.h: * runtime/ObjectConstructor.h: * runtime/ObjectPrototype.h: * runtime/ProgramExecutable.h: * runtime/PromiseDeferredTimer.cpp: * runtime/PropertyMapHashTable.h: * runtime/PropertyNameArray.h: (JSC::PropertyNameArray::add): * runtime/PrototypeKey.h: * runtime/ProxyConstructor.h: * runtime/ProxyObject.cpp: (JSC::ProxyObject::performGetOwnPropertyNames): * runtime/ProxyRevoke.h: * runtime/ReflectObject.h: * runtime/RegExp.h: * runtime/RegExpCache.h: * runtime/RegExpConstructor.h: * runtime/RegExpKey.h: * runtime/RegExpObject.h: * runtime/RegExpPrototype.h: * runtime/RegExpStringIteratorPrototype.h: * runtime/SamplingProfiler.cpp: * runtime/ScopedArgumentsTable.h: * runtime/ScriptExecutable.h: * runtime/SetPrototype.h: * runtime/SmallStrings.h: * runtime/SparseArrayValueMap.h: * runtime/StringConstructor.h: * runtime/StringIteratorPrototype.h: * runtime/StringObject.h: * runtime/StringPrototype.h: * runtime/Structure.h: * runtime/StructureChain.h: * runtime/StructureRareData.h: * runtime/StructureTransitionTable.h: * runtime/Symbol.h: * runtime/SymbolConstructor.h: * runtime/SymbolPrototype.h: * runtime/SymbolTable.h: * runtime/TemplateObjectDescriptor.h: * runtime/TypeProfiler.cpp: * runtime/TypeProfiler.h: * runtime/TypeProfilerLog.cpp: * runtime/VarOffset.h: * testRegExp.cpp: * tools/HeapVerifier.cpp: (JSC::HeapVerifier::checkIfRecorded): * tools/JSDollarVM.cpp: * wasm/WasmB3IRGenerator.cpp: * wasm/WasmBBQPlan.cpp: * wasm/WasmFaultSignalHandler.cpp: * wasm/WasmFunctionParser.h: * wasm/WasmOMGForOSREntryPlan.cpp: * wasm/WasmOMGPlan.cpp: * wasm/WasmPlan.cpp: * wasm/WasmSignature.cpp: * wasm/WasmSignature.h: * wasm/WasmWorklist.cpp: * wasm/js/JSWebAssembly.h: * wasm/js/JSWebAssemblyCodeBlock.h: * wasm/js/WebAssemblyCompileErrorConstructor.h: * wasm/js/WebAssemblyCompileErrorPrototype.h: * wasm/js/WebAssemblyFunction.h: * wasm/js/WebAssemblyInstanceConstructor.h: * 
wasm/js/WebAssemblyInstancePrototype.h: * wasm/js/WebAssemblyLinkErrorConstructor.h: * wasm/js/WebAssemblyLinkErrorPrototype.h: * wasm/js/WebAssemblyMemoryConstructor.h: * wasm/js/WebAssemblyMemoryPrototype.h: * wasm/js/WebAssemblyModuleConstructor.h: * wasm/js/WebAssemblyModulePrototype.h: * wasm/js/WebAssemblyRuntimeErrorConstructor.h: * wasm/js/WebAssemblyRuntimeErrorPrototype.h: * wasm/js/WebAssemblyTableConstructor.h: * wasm/js/WebAssemblyTablePrototype.h: * wasm/js/WebAssemblyToJSCallee.h: * yarr/Yarr.h: * yarr/YarrParser.h: * yarr/generateYarrCanonicalizeUnicode: Source/WebCore: No new tests. Covered by existing tests. * bindings/js/JSDOMConstructorBase.h: * bindings/js/JSDOMWindowProperties.h: * bindings/scripts/CodeGeneratorJS.pm: (GenerateHeader): (GeneratePrototypeDeclaration): * bindings/scripts/test/JS/JSTestActiveDOMObject.h: * bindings/scripts/test/JS/JSTestEnabledBySetting.h: * bindings/scripts/test/JS/JSTestEnabledForContext.h: * bindings/scripts/test/JS/JSTestEventTarget.h: * bindings/scripts/test/JS/JSTestGlobalObject.h: * bindings/scripts/test/JS/JSTestIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedGetterCallWith.h: * bindings/scripts/test/JS/JSTestNamedGetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedGetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterNoIdentifier.h: * 
bindings/scripts/test/JS/JSTestNamedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetterAndSetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgableProperties.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgablePropertiesAndOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestObj.h: * bindings/scripts/test/JS/JSTestOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestPluginInterface.h: * bindings/scripts/test/JS/JSTestTypedefs.h: * bridge/objc/objc_runtime.h: * bridge/runtime_array.h: * bridge/runtime_method.h: * bridge/runtime_object.h: Source/WebKit: * WebProcess/Plugins/Netscape/JSNPObject.h: Source/WTF: * wtf/Assertions.cpp: * wtf/AutomaticThread.cpp: * wtf/BitVector.h: * wtf/Bitmap.h: * wtf/BloomFilter.h: * wtf/Brigand.h: * wtf/CheckedArithmetic.h: * wtf/CrossThreadCopier.h: * wtf/CurrentTime.cpp: * wtf/DataLog.cpp: * wtf/DateMath.cpp: (WTF::daysFrom1970ToYear): * wtf/DeferrableRefCounted.h: * wtf/GetPtr.h: * wtf/HashFunctions.h: * wtf/HashMap.h: * wtf/HashTable.h: * wtf/HashTraits.h: * wtf/JSONValues.cpp: * wtf/JSONValues.h: * wtf/ListHashSet.h: * wtf/Lock.h: * wtf/LockAlgorithm.h: * wtf/LockAlgorithmInlines.h: (WTF::Hooks>::lockSlow): * wtf/Logger.h: * wtf/LoggerHelper.h: (WTF::LoggerHelper::childLogIdentifier const): * wtf/MainThread.cpp: * wtf/MetaAllocatorPtr.h: * wtf/MonotonicTime.h: * wtf/NaturalLoops.h: (WTF::NaturalLoops::NaturalLoops): * wtf/ObjectIdentifier.h: * wtf/RAMSize.cpp: * wtf/Ref.h: * wtf/RefPtr.h: * wtf/RetainPtr.h: * wtf/SchedulePair.h: * wtf/StackShot.h: * wtf/StdLibExtras.h: * wtf/TinyPtrSet.h: * wtf/URL.cpp: * wtf/URLHash.h: * wtf/URLParser.cpp: (WTF::URLParser::defaultPortForProtocol): * wtf/Vector.h: * wtf/VectorTraits.h: * wtf/WallTime.h: * wtf/WeakHashSet.h: * wtf/WordLock.h: * 
wtf/cocoa/CPUTimeCocoa.cpp: * wtf/cocoa/MemoryPressureHandlerCocoa.mm: * wtf/persistence/PersistentDecoder.h: * wtf/persistence/PersistentEncoder.h: * wtf/text/AtomStringHash.h: * wtf/text/CString.h: * wtf/text/StringBuilder.cpp: (WTF::expandedCapacity): * wtf/text/StringHash.h: * wtf/text/StringImpl.h: * wtf/text/StringToIntegerConversion.h: (WTF::toIntegralType): * wtf/text/SymbolRegistry.h: * wtf/text/TextStream.cpp: (WTF::hasFractions): * wtf/text/WTFString.h: * wtf/text/cocoa/TextBreakIteratorInternalICUCocoa.cpp: Canonical link: https://commits.webkit.org/215538@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@250005 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2019-09-18 00:36:19 +00:00
static constexpr uint8_t isHeldBit = 1;
static constexpr uint8_t hasParkedBit = 2;
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen. Source/JavaScriptCore: This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again. It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff: - The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors. - The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of. - All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. 
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
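The factoring described above, a lock algorithm exposed as static methods that take the atomic word as their first argument, can be illustrated roughly as follows. This is a minimal sketch, not the contents of wtf/LockAlgorithm.h: the name `LockAlgorithmSketch` is hypothetical, it uses `std::atomic` in place of WTF's `Atomic<>`, and it yields instead of parking, so `hasParkedBit` is only checked for overlap rather than used.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical sketch: the algorithm owns no storage. It is "projected" onto
// any existing atomic word, touching only the bits named by the template
// parameters and preserving every other bit in that word.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct LockAlgorithmSketch {
    static_assert(!(isHeldBit & hasParkedBit), "lock bits must not overlap");

    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        for (;;) {
            if (old & isHeldBit)
                return false;
            // A weak CAS may fail spuriously; on failure `old` is refreshed
            // and the held bit is re-checked.
            if (word.compare_exchange_weak(old, static_cast<LockType>(old | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    // Assumption: a real implementation parks the thread here; this sketch
    // simply yields until the lock becomes available.
    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    static void unlock(std::atomic<LockType>& word)
    {
        LockType old = word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
        assert(old & isHeldBit); // Unlocking a lock that is not held is a bug.
        (void)old;
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return (word.load(std::memory_order_acquire) & isHeldBit) != 0;
    }
};
```

With an alias such as `using CellLock = LockAlgorithmSketch<uint8_t, 0x40, 0x80>;` (the bit values are illustrative only), calling `CellLock::lock(header)` on a `std::atomic<uint8_t>` header byte leaves the remaining six bits available for unrelated state, which is the property the commit message relies on for the JSCell header.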
2016-11-15 01:49:22 +00:00
Lightweight locks should be adaptive https://bugs.webkit.org/show_bug.cgi?id=147545 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * dfg/DFGCommon.cpp: (JSC::DFG::startCrashing): * heap/CopiedBlock.h: (JSC::CopiedBlock::workListLock): * heap/CopiedBlockInlines.h: (JSC::CopiedBlock::shouldReportLiveBytes): (JSC::CopiedBlock::reportLiveBytes): * heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock): * heap/CopiedSpace.h: (JSC::CopiedSpace::CopiedGeneration::CopiedGeneration): * heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleEvacuatedBlock): * heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::didStartCopying): * heap/GCThreadSharedData.h: (JSC::GCThreadSharedData::getNextBlocksToCopy): * heap/ListableHandler.h: (JSC::ListableHandler::List::addThreadSafe): (JSC::ListableHandler::List::addNotThreadSafe): * heap/MachineStackMarker.cpp: (JSC::MachineThreads::tryCopyOtherThreadStacks): * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::copyLater): * parser/SourceProvider.cpp: (JSC::SourceProvider::~SourceProvider): (JSC::SourceProvider::getID): * profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::addDatabaseToAtExit): (JSC::Profiler::Database::removeDatabaseFromAtExit): (JSC::Profiler::Database::removeFirstAtExitDatabase): * runtime/TypeProfilerLog.h: Source/WebCore: * bindings/objc/WebScriptObject.mm: (WebCore::getJSWrapper): (WebCore::addJSWrapper): (WebCore::removeJSWrapper): (WebCore::removeJSWrapperIfRetainCountOne): * platform/audio/mac/CARingBuffer.cpp: (WebCore::CARingBuffer::setCurrentFrameBounds): (WebCore::CARingBuffer::getCurrentFrameBounds): * platform/audio/mac/CARingBuffer.h: * platform/ios/wak/WAKWindow.mm: (-[WAKWindow setExposedScrollViewRect:]): (-[WAKWindow exposedScrollViewRect]): Source/WebKit2: * WebProcess/WebPage/EventDispatcher.cpp: (WebKit::EventDispatcher::clearQueuedTouchEventsForPage): (WebKit::EventDispatcher::getQueuedTouchEventsForPage): (WebKit::EventDispatcher::touchEvent): 
(WebKit::EventDispatcher::dispatchTouchEvents): * WebProcess/WebPage/EventDispatcher.h: * WebProcess/WebPage/ViewUpdateDispatcher.cpp: (WebKit::ViewUpdateDispatcher::visibleContentRectUpdate): (WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate): * WebProcess/WebPage/ViewUpdateDispatcher.h: Source/WTF: A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big. But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock. The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. 
The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock. This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time. On a locking microbenchmark, this new Lock exhibits the following performance characteristics: - Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex. - Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex. - CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock. - Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock. This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time. This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock. 
But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665. Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head. [1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks: Added. * benchmarks/LockSpeedTest.cpp: Added. (main): * wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong): * wtf/CMakeLists.txt: * wtf/Lock.cpp: Added. (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): * wtf/Lock.h: Added. (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocatorHandle::shrink): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): (WTF::MetaAllocator::addFreshFreeSpace): (WTF::MetaAllocator::debugFreeSpaceSize): * wtf/MetaAllocator.h: * wtf/SpinLock.h: * wtf/ThreadingPthreads.cpp: * wtf/ThreadingWin.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicStringImpl.cpp: (WTF::AtomicStringTableLocker::AtomicStringTableLocker): Tools: * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Lock.cpp: Added. (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/165908@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
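The three levels of adaptation described in this commit message (inlined CAS fast path, a bounded burst of spinning, then sleeping on a queue) can be sketched as below. This is an illustration under stated assumptions, not the code in wtf/Lock.cpp: `AdaptiveLockSketch` is a hypothetical name, the wait queue is approximated with a `std::mutex` and `std::condition_variable` rather than statically allocated thread nodes behind a tagged pointer, and the spin limit of 40 is an arbitrary choice.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

class AdaptiveLockSketch {
public:
    void lock()
    {
        // Level 1: a single CAS acquires a lock that nobody holds.
        uint8_t expected = 0;
        if (m_state.compare_exchange_weak(expected, isHeldBit,
                std::memory_order_acquire, std::memory_order_relaxed))
            return;
        lockSlow();
    }

    void unlock()
    {
        // A weak CAS can fail spuriously, so unlockSlow() must cope with
        // being called when no thread is actually waiting.
        uint8_t expected = isHeldBit;
        if (m_state.compare_exchange_weak(expected, 0,
                std::memory_order_release, std::memory_order_relaxed))
            return;
        unlockSlow();
    }

    bool isHeld() const { return (m_state.load(std::memory_order_acquire) & isHeldBit) != 0; }

private:
    static constexpr uint8_t isHeldBit = 1;
    static constexpr uint8_t hasParkedBit = 2;
    static constexpr unsigned spinLimit = 40; // Assumed value for illustration.

    void lockSlow()
    {
        unsigned spinCount = 0;
        for (;;) {
            uint8_t current = m_state.load(std::memory_order_relaxed);
            if (!(current & isHeldBit)) {
                // The lock was released: barge in, keeping the parked bit.
                if (m_state.compare_exchange_weak(current, static_cast<uint8_t>(current | isHeldBit),
                        std::memory_order_acquire, std::memory_order_relaxed))
                    return;
                continue;
            }
            // Level 2: spin briefly, but only while nobody is parked.
            if (!(current & hasParkedBit) && spinCount < spinLimit) {
                ++spinCount;
                std::this_thread::yield();
                continue;
            }
            // Level 3: advertise a waiter and go to sleep.
            std::unique_lock<std::mutex> queueLocker(m_queueMutex);
            current = m_state.load(std::memory_order_relaxed);
            if (!(current & isHeldBit))
                continue; // Released in the meantime; retry.
            if (!(current & hasParkedBit)
                && !m_state.compare_exchange_weak(current, static_cast<uint8_t>(current | hasParkedBit),
                    std::memory_order_relaxed, std::memory_order_relaxed))
                continue; // State changed under us; retry.
            ++m_parkedCount;
            m_queueCondition.wait(queueLocker); // Spurious wakeups are fine: we loop.
            --m_parkedCount;
        }
    }

    void unlockSlow()
    {
        std::lock_guard<std::mutex> queueLocker(m_queueMutex);
        // Keep the parked bit only if someone may remain asleep after this wakeup.
        m_state.store(m_parkedCount > 1 ? hasParkedBit : 0, std::memory_order_release);
        if (m_parkedCount)
            m_queueCondition.notify_one();
    }

    std::atomic<uint8_t> m_state { 0 };
    std::mutex m_queueMutex;
    std::condition_variable m_queueCondition;
    unsigned m_parkedCount { 0 }; // Guarded by m_queueMutex.
};
```

Like the lock described above, this sketch barges: `unlockSlow()` releases the lock before the woken thread runs, so another thread may grab it first. It also does not reproduce the `sizeof(void*)` footprint, because the mutex and condition variable live inside the object.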
2015-08-07 22:38:59 +00:00
WTF_EXPORT_PRIVATE void lockSlow();
WTF_EXPORT_PRIVATE void unlockSlow();
WTF::Lock should be fair eventually https://bugs.webkit.org/show_bug.cgi?id=159384 Reviewed by Geoffrey Garen. Source/WTF: In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of locks makes them fast. That post presented lock fairness as a trade-off between two extremes: - Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a thread on the queue. If there was a thread on the queue, the lock is released and that thread is made runnable. That thread may then grab the lock, or some other thread may grab the lock first (it may barge). Usually, the barging thread is the thread that released the lock in the first place. This maximizes throughput but hurts fairness. There is no good theoretical bound on how unfair the lock may become, but empirical data suggests that it's fair enough for the cases we previously measured. - FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock() if there is a thread waiting. If there is a thread waiting, unlock() will make that thread runnable and inform it that it now holds the lock. This ensures perfect round-robin fairness and allows us to reason theoretically about how long it may take for a thread to grab the lock. For example, if we know that only N threads are running and each one may contend on a critical section, and each one may hold the lock for at most S seconds, then the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly in most cases. This is because for the common case of short critical sections, they force a context switch after each critical section if the lock is contended. This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging. Thanks to this new algorithm, you can now have both of these things at the same time. This change makes WTF::Lock eventually fair. 
We can almost (more on the caveats below) guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical sections that are longer than 1ms are always fair. For shorter critical sections, the amount of time that any thread waits is 1ms times the number of threads. There are some caveats that arise from our use of randomness, but even then, in the limit as the critical section length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our experiments show that the lock becomes exactly as fair as a FIFO lock for any critical section that is 1ms or longer. The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair. ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each other via a token. unparkOne() can pass a token, which parkConditionally() will return. This change also makes parkConditionally() a lot more precise about when it was unparked due to a call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is guaranteed to report that it was unparked rather than timing out, and that thread is guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing fair unlocking. In that case, unlock() will not release the lock, and lock() will know that it holds the lock as soon as parkConditionally() returns. 
Note that this algorithm relies on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock can make a decision about unlock strategy and inject a token while it has complete knowledge of the state of the queue. As such, it's not immediately obvious how to implement this algorithm on top of futexes. You really need ParkingLot! WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(), which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms at random. When a dequeue happens and there are threads that actually get dequeued, we check if the time since the last fair unlock (the last time timeToBeFair was set to true) is more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout. This means that in the absence of ParkingLot collisions, fair unlocking is guaranteed to happen at least once per millisecond. It will happen at 2 kHz on average. If there are collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid resonance. Imagine a program in which some thread acquires a lock at 1 kHz in-phase with the timeToBeFair timeout. Then this thread would be the beneficiary of fairness to the detriment of everyone else. Randomness ensures that we aren't too fair to any one thread. Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous improvement in LockFairnessTest. 
It's common for an unfair lock (either our BargingLock, the old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to allow only one thread to hold the lock during a whole second in which each thread is holding the lock for 1ms at a time. This is because in a barging lock, releasing a lock after holding it for 1ms and then reacquiring it immediately virtually ensures that none of the other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock handles this case like a champ: each thread gets equal turns. Here's some data. If we launch 10 threads and have each of them run for 1 second while repeatedly holding a critical section for 1ms, then here's how many times each thread gets to hold the lock using the old WTF::Lock algorithm: 799, 6, 1, 1, 1, 1, 1, 1, 1, 1 One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock becomes totally fair: 80, 79, 79, 79, 79, 79, 79, 80, 80, 79 I don't know of anyone creating such an automatically-fair adaptive lock before, so I think that this is a pretty awesome advancement to the state of the art! This change is good for three reasons: - We do have long critical sections in WebKit and we don't want to have to worry about starvation. This reduces the likelihood that we will see starvation due to our lock strategy. - I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or lockFairly() or some moral equivalent for the scavenger thread. - If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability to unlock and relock without barging. 
* benchmarks/LockFairnessTest.cpp: (main): * benchmarks/ToyLocks.h: * wtf/Condition.h: (WTF::ConditionBase::waitUntil): (WTF::ConditionBase::notifyOne): * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): * wtf/Lock.h: (WTF::LockBase::try_lock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (WTF::LockBase::isFullyReset): * wtf/ParkingLot.cpp: (WTF::ParkingLot::parkConditionallyImpl): (WTF::ParkingLot::unparkOne): (WTF::ParkingLot::unparkOneImpl): (WTF::ParkingLot::unparkAll): * wtf/ParkingLot.h: (WTF::ParkingLot::parkConditionally): (WTF::ParkingLot::compareAndPark): (WTF::ParkingLot::unparkOne): Tools: * TestWebKitAPI/Tests/WTF/ParkingLot.cpp: Canonical link: https://commits.webkit.org/178039@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
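The stochastic fairness timeout and the approximate wait bound N * max(1ms, S) from this commit message can be sketched as follows. This is a hedged illustration rather than the ParkingLot implementation: `FairnessTimerSketch` and `approximateWaitBound` are hypothetical names, the timer is a standalone object instead of per-bucket state, and the caller supplies the current time so the behavior is easy to follow.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical sketch of a randomized "time to be fair" deadline.
class FairnessTimerSketch {
public:
    using Clock = std::chrono::steady_clock;

    explicit FairnessTimerSketch(unsigned seed)
        : m_random(seed)
    {
    }

    // Meant to be called when a thread is actually dequeued. Returns true once
    // the deadline has passed, then re-arms it at a random point within the
    // next millisecond so that no periodic locker can stay in phase with it.
    bool timeToBeFair(Clock::time_point now)
    {
        if (now <= m_nextFairTime)
            return false;
        std::uniform_int_distribution<long long> delayInMicroseconds(0, 1000);
        m_nextFairTime = now + std::chrono::microseconds(delayInMicroseconds(m_random));
        return true;
    }

private:
    std::mt19937 m_random;
    Clock::time_point m_nextFairTime { }; // Clock epoch, so the first dequeue is fair.
};

// The approximate bound quoted above: N * max(1ms, S), ignoring the caveats
// about randomness and ParkingLot collisions.
inline std::chrono::microseconds approximateWaitBound(unsigned threadCount, std::chrono::microseconds criticalSection)
{
    return threadCount * std::max(std::chrono::microseconds(1000), criticalSection);
}
```

For example, with 10 threads and 200-microsecond critical sections the bound works out to 10 * 1ms = 10ms, while 5ms critical sections give 10 * 5ms = 50ms. A fixed 1ms deadline would be simpler, but as the message explains, randomizing it avoids resonance with a thread that locks at a steady rate.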
2016-07-18 18:32:52 +00:00
WTF_EXPORT_PRIVATE void unlockFairlySlow();
PerformanceTests: Concurrent GC should be stable enough to land enabled https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Made CDjs more configurable and refined the "large.js" configuration. I was using that one and the new "long.js" configuration to tune concurrent eden GCs. Added a new way of running Splay in browser, which uses chartjs to plot the execution times of 2000 iterations. This includes the minified chartjs. * JetStream/Octane2/splay-detail.html: Added. * JetStream/cdjs/benchmark.js: (benchmarkImpl): (benchmark): * JetStream/cdjs/long.js: Added. Source/JavaScriptCore: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. This fixes a ton of performance and correctness bugs revealed by getting the concurrent GC to be stable enough to land enabled. I had to redo the JSObject::visitChildren concurrency protocol again. This time I think it's even more correct than ever! This is an enormous win on JetStream/splay-latency and Octane/SplayLatency. It looks to be mostly neutral on everything else, though Speedometer is showing statistically weak signs of a slight regression. * API/JSAPIWrapperObject.mm: Added locking. (JSC::JSAPIWrapperObject::visitChildren): * API/JSCallbackObject.h: Added locking. (JSC::JSCallbackObjectData::visitChildren): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::setPrivateProperty): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::deletePrivateProperty): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::visitChildren): * CMakeLists.txt: * JavaScriptCore.xcodeproj/project.pbxproj: * bytecode/CodeBlock.cpp: (JSC::CodeBlock::UnconditionalFinalizer::finalizeUnconditionally): This had a TOCTOU race on shouldJettisonDueToOldAge. (JSC::EvalCodeCache::visitAggregate): Moved to EvalCodeCache.cpp. * bytecode/DirectEvalCodeCache.cpp: Added. Outlined some functions and made them use locks. 
(JSC::DirectEvalCodeCache::setSlow): (JSC::DirectEvalCodeCache::clear): (JSC::DirectEvalCodeCache::visitAggregate): * bytecode/DirectEvalCodeCache.h: (JSC::DirectEvalCodeCache::set): (JSC::DirectEvalCodeCache::clear): Deleted. * bytecode/UnlinkedCodeBlock.cpp: Added locking. (JSC::UnlinkedCodeBlock::visitChildren): (JSC::UnlinkedCodeBlock::setInstructions): (JSC::UnlinkedCodeBlock::shrinkToFit): * bytecode/UnlinkedCodeBlock.h: Added locking. (JSC::UnlinkedCodeBlock::addRegExp): (JSC::UnlinkedCodeBlock::addConstant): (JSC::UnlinkedCodeBlock::addFunctionDecl): (JSC::UnlinkedCodeBlock::addFunctionExpr): (JSC::UnlinkedCodeBlock::createRareDataIfNecessary): (JSC::UnlinkedCodeBlock::shrinkToFit): Deleted. * debugger/Debugger.cpp: Use the right delete API. (JSC::Debugger::recompileAllJSFunctions): * dfg/DFGAbstractInterpreterInlines.h: (JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects): Fix a pre-existing bug in ToFunction constant folding. * dfg/DFGClobberize.h: Add support for nuking. (JSC::DFG::clobberize): * dfg/DFGClobbersExitState.cpp: Add support for nuking. (JSC::DFG::clobbersExitState): * dfg/DFGFixupPhase.cpp: Add support for nuking. (JSC::DFG::FixupPhase::fixupNode): (JSC::DFG::FixupPhase::indexForChecks): (JSC::DFG::FixupPhase::originForCheck): (JSC::DFG::FixupPhase::speculateForBarrier): (JSC::DFG::FixupPhase::insertCheck): (JSC::DFG::FixupPhase::fixupChecksInBlock): * dfg/DFGSpeculativeJIT.cpp: Add support for nuking. (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): * ftl/FTLLowerDFGToB3.cpp: Add support for nuking. (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::nukeStructureAndSetButterfly): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): Deleted. 
* heap/CodeBlockSet.cpp: We need to be more careful about the CodeBlockSet workflow during GC, since we will allocate CodeBlocks in eden while collecting. (JSC::CodeBlockSet::clearMarksForFullCollection): (JSC::CodeBlockSet::deleteUnmarkedAndUnreferenced): * heap/Heap.cpp: Added code to measure max pauses. Added a better collectContinuously mode. (JSC::Heap::lastChanceToFinalize): Stop the collectContinuously thread. (JSC::Heap::harvestWeakReferences): Inline SlotVisitor::harvestWeakReferences. (JSC::Heap::finalizeUnconditionalFinalizers): Inline SlotVisitor::finalizeUnconditionalReferences. (JSC::Heap::markToFixpoint): We need to do some MarkedSpace stuff before every conservative scan, rather than just at the start of marking, so we now call prepareForConservativeScan() before each conservative scan. Also call a less-parallel version of drainInParallel when the mutator is running. (JSC::Heap::collectInThread): Inline Heap::prepareForAllocation(). (JSC::Heap::stopIfNecessarySlow): We need to be more careful about ensuring that we run finalization before and after stopping. Also, we should sanitize stack when stopping the world. (JSC::Heap::acquireAccessSlow): Add some optional debug prints. (JSC::Heap::handleNeedFinalize): Assert that we are running this when the world is not stopped. (JSC::Heap::finalize): Remove the old collectContinuously code. (JSC::Heap::requestCollection): We don't need to sanitize stack here anymore. (JSC::Heap::notifyIsSafeToCollect): Start the collectContinuously thread. It will request collection 1 KHz. (JSC::Heap::prepareForAllocation): Deleted. (JSC::Heap::preventCollection): Prevent any new concurrent GCs from being initiated. (JSC::Heap::allowCollection): (JSC::Heap::forEachSlotVisitor): Allows us to safely iterate slot visitors. * heap/Heap.h: * heap/HeapInlines.h: (JSC::Heap::writeBarrier): If the 'to' cell is not NewWhite then it could be AnthraciteOrBlack. 
During a full collection, objects may be AnthraciteOrBlack from a previous GC. Turns out, we don't benefit from this optimization so we can just kill it. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::buildSnapshot): This needs to use PreventCollectionScope to ensure snapshot soundness. * heap/ListableHandler.h: (JSC::ListableHandler::isOnList): Useful helper. * heap/LockDuringMarking.h: (JSC::lockDuringMarking): It's a locker that only locks while we're marking. * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::addBlock): Hold the bitvector lock while resizing. * heap/MarkedBlock.cpp: Hold the bitvector lock while accessing the bitvectors while the mutator is running. * heap/MarkedSpace.cpp: (JSC::MarkedSpace::prepareForConservativeScan): We used to do this in prepareForMarking, but we need to do it before each conservative scan not just before marking. (JSC::MarkedSpace::prepareForMarking): Remove the logic moved to prepareForConservativeScan. * heap/MarkedSpace.h: * heap/PreventCollectionScope.h: Added. * heap/SlotVisitor.cpp: Refactored drainFromShared so that we can write a similar function called drainInParallelPassively. (JSC::SlotVisitor::updateMutatorIsStopped): Update whether we can use "fast" scanning. (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drain): This now uses the rightToRun lock to allow the main GC thread to safepoint the workers. (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): This runs marking with one fewer threads than normal. It's useful for when we have resumed the mutator, since then the mutator has a better chance of getting on a core. (JSC::SlotVisitor::addWeakReferenceHarvester): (JSC::SlotVisitor::addUnconditionalFinalizer): (JSC::SlotVisitor::harvestWeakReferences): Deleted. (JSC::SlotVisitor::finalizeUnconditionalFinalizers): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: Outline stuff. (JSC::SlotVisitor::addWeakReferenceHarvester): Deleted. (JSC::SlotVisitor::addUnconditionalFinalizer): Deleted. * runtime/InferredType.cpp: This needed thread safety. (JSC::InferredType::visitChildren): This needs to keep its structure finalizer alive until it runs. (JSC::InferredType::set): (JSC::InferredType::InferredStructureFinalizer::finalizeUnconditionally): * runtime/InferredType.h: * runtime/InferredValue.cpp: This needed thread safety. (JSC::InferredValue::visitChildren): (JSC::InferredValue::ValueCleanup::finalizeUnconditionally): * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): Update to use new butterfly API. (JSC::JSArray::unshiftCountWithArrayStorage): Update to use new butterfly API. * runtime/JSArrayBufferView.cpp: (JSC::JSArrayBufferView::visitChildren): Thread safety. * runtime/JSCell.h: (JSC::JSCell::setStructureIDDirectly): This is used for nuking the structure. (JSC::JSCell::InternalLocker::InternalLocker): Deleted. The cell is now the lock. (JSC::JSCell::InternalLocker::~InternalLocker): Deleted. The cell is now the lock. * runtime/JSCellInlines.h: (JSC::JSCell::structure): Clean this up. (JSC::JSCell::lock): The cell is now the lock. (JSC::JSCell::tryLock): (JSC::JSCell::unlock): (JSC::JSCell::isLocked): (JSC::JSCell::lockInternalLock): Deleted. (JSC::JSCell::unlockInternalLock): Deleted. * runtime/JSFunction.cpp: (JSC::JSFunction::visitChildren): Thread safety. * runtime/JSGenericTypedArrayViewInlines.h: (JSC::JSGenericTypedArrayView<Adaptor>::visitChildren): Thread safety. (JSC::JSGenericTypedArrayView<Adaptor>::slowDownAndWasteMemory): Thread safety. * runtime/JSObject.cpp: (JSC::JSObject::markAuxiliaryAndVisitOutOfLineProperties): Factor out this "easy" step of butterfly visiting. (JSC::JSObject::visitButterfly): Make this achieve 100% precision about structure-butterfly relationships. 
This relies on the mutator "nuking" the structure prior to "locked" structure-butterfly transitions. (JSC::JSObject::visitChildren): Use the new, nicer API. (JSC::JSFinalObject::visitChildren): Use the new, nicer API. (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): Use the new butterfly API. (JSC::JSObject::createInitialUndecided): Use the new butterfly API. (JSC::JSObject::createInitialInt32): Use the new butterfly API. (JSC::JSObject::createInitialDouble): Use the new butterfly API. (JSC::JSObject::createInitialContiguous): Use the new butterfly API. (JSC::JSObject::createArrayStorage): Use the new butterfly API. (JSC::JSObject::convertUndecidedToContiguous): Use the new butterfly API. (JSC::JSObject::convertUndecidedToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertInt32ToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertDoubleToContiguous): Use the new butterfly API. (JSC::JSObject::convertDoubleToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertContiguousToArrayStorage): Use the new butterfly API. (JSC::JSObject::increaseVectorLength): Use the new butterfly API. (JSC::JSObject::shiftButterflyAfterFlattening): Use the new butterfly API. * runtime/JSObject.h: (JSC::JSObject::setButterfly): This now does all of the fences. Only use this when you are not also transitioning the structure or the structure's lastOffset. (JSC::JSObject::nukeStructureAndSetButterfly): Use this when doing locked structure-butterfly transitions. * runtime/JSObjectInlines.h: (JSC::JSObject::putDirectWithoutTransition): Use the newly factored out API. (JSC::JSObject::prepareToPutDirectWithoutTransition): Factor this out! (JSC::JSObject::putDirectInternal): Use the newly factored out API. * runtime/JSPropertyNameEnumerator.cpp: (JSC::JSPropertyNameEnumerator::finishCreation): Locks! (JSC::JSPropertyNameEnumerator::visitChildren): Locks! 
* runtime/JSSegmentedVariableObject.cpp: (JSC::JSSegmentedVariableObject::visitChildren): Locks! * runtime/JSString.cpp: (JSC::JSString::visitChildren): Thread safety. * runtime/ModuleProgramExecutable.cpp: (JSC::ModuleProgramExecutable::visitChildren): Thread safety. * runtime/Options.cpp: For now we disable concurrent GC on not-X86_64. (JSC::recomputeDependentOptions): * runtime/Options.h: Change the default max GC parallelism to 8. I don't know why it was still 7. * runtime/SamplingProfiler.cpp: (JSC::SamplingProfiler::stackTracesAsJSON): This needs to defer GC before grabbing its lock. * runtime/SparseArrayValueMap.cpp: This needed thread safety. (JSC::SparseArrayValueMap::add): (JSC::SparseArrayValueMap::remove): (JSC::SparseArrayValueMap::visitChildren): * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: This had a race between addNewPropertyTransition and visitChildren. (JSC::Structure::Structure): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::add): Help out with nuking support - the m_offset needs to play along. (JSC::Structure::visitChildren): * runtime/Structure.h: Make some useful things public - like the notion of a lastOffset. * runtime/StructureChain.cpp: (JSC::StructureChain::visitChildren): Thread safety! * runtime/StructureChain.h: Thread safety! * runtime/StructureIDTable.cpp: (JSC::StructureIDTable::allocateID): Ensure that we don't get nuked IDs. * runtime/StructureIDTable.h: Add the notion of a nuked ID! It's a bit that the runtime never sees except during specific shady actions like locked structure-butterfly transitions. "Nuking" tells the GC to steer clear and rescan once we fire the barrier. (JSC::nukedStructureIDBit): (JSC::nuke): (JSC::isNuked): (JSC::decontaminate): * runtime/StructureInlines.h: (JSC::Structure::hasIndexingHeader): Better API. (JSC::Structure::add): * runtime/VM.cpp: Better GC interaction. 
(JSC::VM::ensureWatchdog): (JSC::VM::deleteAllLinkedCode): (JSC::VM::deleteAllCode): * runtime/VM.h: (JSC::VM::getStructure): Why wasn't this always an API! * runtime/WebAssemblyExecutable.cpp: (JSC::WebAssemblyExecutable::visitChildren): Thread safety. Source/WebCore: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Made WebCore down with concurrent marking by adding some locking and adapting to some new API. This has new test modes in run-jsc-stress-tests. Also, the way that LayoutTests run is already a fantastic GC test. * ForwardingHeaders/heap/DeleteAllCodeEffort.h: Added. * ForwardingHeaders/heap/LockDuringMarking.h: Added. * bindings/js/GCController.cpp: (WebCore::GCController::deleteAllCode): (WebCore::GCController::deleteAllLinkedCode): * bindings/js/GCController.h: * bindings/js/JSDOMBinding.cpp: (WebCore::getCachedDOMStructure): (WebCore::cacheDOMStructure): * bindings/js/JSDOMGlobalObject.cpp: (WebCore::JSDOMGlobalObject::addBuiltinGlobals): (WebCore::JSDOMGlobalObject::visitChildren): * bindings/js/JSDOMGlobalObject.h: (WebCore::getDOMConstructor): * bindings/js/JSDOMPromise.cpp: (WebCore::DeferredPromise::DeferredPromise): (WebCore::DeferredPromise::clear): * bindings/js/JSXPathResultCustom.cpp: (WebCore::JSXPathResult::visitAdditionalChildren): * dom/EventListenerMap.cpp: (WebCore::EventListenerMap::clear): (WebCore::EventListenerMap::replace): (WebCore::EventListenerMap::add): (WebCore::EventListenerMap::remove): (WebCore::EventListenerMap::find): (WebCore::EventListenerMap::removeFirstEventListenerCreatedFromMarkup): (WebCore::EventListenerMap::copyEventListenersNotCreatedFromMarkupToTarget): (WebCore::EventListenerIterator::EventListenerIterator): * dom/EventListenerMap.h: (WebCore::EventListenerMap::lock): * dom/EventTarget.cpp: (WebCore::EventTarget::visitJSEventListeners): * dom/EventTarget.h: (WebCore::EventTarget::visitJSEventListeners): Deleted. 
* dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): * dom/Node.h: * page/MemoryRelease.cpp: (WebCore::releaseCriticalMemory): * page/cocoa/MemoryReleaseCocoa.mm: (WebCore::jettisonExpensiveObjectsOnTopLevelNavigation): (WebCore::registerMemoryReleaseNotifyCallbacks): Source/WTF: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Adds the ability to say: auto locker = holdLock(any type of lock) Instead of having to say: Locker<LockType> locker(locks of type LockType) I think that we should use "auto locker = holdLock(lock)" as the default way that we acquire locks unless we need to use a special locker type. This also adds the ability to safepoint a lock. Safepointing a lock is basically a super fast way of unlocking it fairly and then immediately relocking it - i.e. letting anyone who is waiting run, without losing steam if there is no one waiting. * wtf/Lock.cpp: (WTF::LockBase::safepointSlow): * wtf/Lock.h: (WTF::LockBase::safepoint): * wtf/LockAlgorithm.h: (WTF::LockAlgorithm::safepointFast): (WTF::LockAlgorithm::safepoint): (WTF::LockAlgorithm::safepointSlow): * wtf/Locker.h: (WTF::AbstractLocker::AbstractLocker): (WTF::Locker::tryLock): (WTF::Locker::operator bool): (WTF::Locker::Locker): (WTF::Locker::operator=): (WTF::holdLock): (WTF::tryHoldLock): Tools: Concurrent GC should be stable enough to land enabled https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Add a new mode that runs GC continuously. Also made eager modes run GC continuously. It's clear that this works just fine in release, but I'm still trying to figure out if it's safe for debug. It might be too slow for debug. 
* Scripts/run-jsc-stress-tests: Canonical link: https://commits.webkit.org/183229@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@209570 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-12-08 22:14:50 +00:00
WTF_EXPORT_PRIVATE void safepointSlow();
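The `holdLock` idiom described in the commit message above can be illustrated with a minimal sketch. The names `ScopedLockerSketch`, `holdLockSketch`, and `ToyLock` are hypothetical stand-ins invented for this example, not the contents of wtf/Locker.h; the point is only that a function template lets the compiler deduce the lock type so the caller can write `auto locker = ...`.

```cpp
#include <utility>

// Hypothetical RAII locker: acquires in the constructor, releases in the destructor.
template<typename LockType>
class ScopedLockerSketch {
public:
    explicit ScopedLockerSketch(LockType& lock)
        : m_lock(&lock)
    {
        m_lock->lock();
    }

    // Movable so that it can be returned from a factory function before C++17.
    ScopedLockerSketch(ScopedLockerSketch&& other) noexcept
        : m_lock(std::exchange(other.m_lock, nullptr))
    {
    }

    ScopedLockerSketch(const ScopedLockerSketch&) = delete;
    ScopedLockerSketch& operator=(const ScopedLockerSketch&) = delete;

    ~ScopedLockerSketch()
    {
        if (m_lock)
            m_lock->unlock();
    }

private:
    LockType* m_lock;
};

// The lock type is deduced from the argument, so callers never spell it out.
template<typename LockType>
ScopedLockerSketch<LockType> holdLockSketch(LockType& lock)
{
    return ScopedLockerSketch<LockType>(lock);
}

// Toy single-threaded lock used only to observe the locker's behavior.
struct ToyLock {
    bool held = false;
    void lock() { held = true; }
    void unlock() { held = false; }
};
```

With this shape, `auto locker = holdLockSketch(someLock);` works for any type that provides `lock()` and `unlock()`, which is the convenience the message argues for.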
Lightweight locks should be adaptive https://bugs.webkit.org/show_bug.cgi?id=147545 Reviewed by Geoffrey Garen. Source/JavaScriptCore: * dfg/DFGCommon.cpp: (JSC::DFG::startCrashing): * heap/CopiedBlock.h: (JSC::CopiedBlock::workListLock): * heap/CopiedBlockInlines.h: (JSC::CopiedBlock::shouldReportLiveBytes): (JSC::CopiedBlock::reportLiveBytes): * heap/CopiedSpace.cpp: (JSC::CopiedSpace::doneFillingBlock): * heap/CopiedSpace.h: (JSC::CopiedSpace::CopiedGeneration::CopiedGeneration): * heap/CopiedSpaceInlines.h: (JSC::CopiedSpace::recycleEvacuatedBlock): * heap/GCThreadSharedData.cpp: (JSC::GCThreadSharedData::didStartCopying): * heap/GCThreadSharedData.h: (JSC::GCThreadSharedData::getNextBlocksToCopy): * heap/ListableHandler.h: (JSC::ListableHandler::List::addThreadSafe): (JSC::ListableHandler::List::addNotThreadSafe): * heap/MachineStackMarker.cpp: (JSC::MachineThreads::tryCopyOtherThreadStacks): * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::copyLater): * parser/SourceProvider.cpp: (JSC::SourceProvider::~SourceProvider): (JSC::SourceProvider::getID): * profiler/ProfilerDatabase.cpp: (JSC::Profiler::Database::addDatabaseToAtExit): (JSC::Profiler::Database::removeDatabaseFromAtExit): (JSC::Profiler::Database::removeFirstAtExitDatabase): * runtime/TypeProfilerLog.h: Source/WebCore: * bindings/objc/WebScriptObject.mm: (WebCore::getJSWrapper): (WebCore::addJSWrapper): (WebCore::removeJSWrapper): (WebCore::removeJSWrapperIfRetainCountOne): * platform/audio/mac/CARingBuffer.cpp: (WebCore::CARingBuffer::setCurrentFrameBounds): (WebCore::CARingBuffer::getCurrentFrameBounds): * platform/audio/mac/CARingBuffer.h: * platform/ios/wak/WAKWindow.mm: (-[WAKWindow setExposedScrollViewRect:]): (-[WAKWindow exposedScrollViewRect]): Source/WebKit2: * WebProcess/WebPage/EventDispatcher.cpp: (WebKit::EventDispatcher::clearQueuedTouchEventsForPage): (WebKit::EventDispatcher::getQueuedTouchEventsForPage): (WebKit::EventDispatcher::touchEvent): 
(WebKit::EventDispatcher::dispatchTouchEvents): * WebProcess/WebPage/EventDispatcher.h: * WebProcess/WebPage/ViewUpdateDispatcher.cpp: (WebKit::ViewUpdateDispatcher::visibleContentRectUpdate): (WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate): * WebProcess/WebPage/ViewUpdateDispatcher.h: Source/WTF: A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big. But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock. The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. 
The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock. This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time. On a locking microbenchmark, this new Lock exhibits the following performance characteristics: - Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex. - Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex. - CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock. - Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock. This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time. This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock. 
But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665. Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head. [1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf * WTF.vcxproj/WTF.vcxproj: * WTF.xcodeproj/project.pbxproj: * benchmarks: Added. * benchmarks/LockSpeedTest.cpp: Added. (main): * wtf/Atomics.h: (WTF::Atomic::compareExchangeWeak): (WTF::Atomic::compareExchangeStrong): * wtf/CMakeLists.txt: * wtf/Lock.cpp: Added. (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): * wtf/Lock.h: Added. (WTF::LockBase::lock): (WTF::LockBase::unlock): (WTF::LockBase::isHeld): (WTF::LockBase::isLocked): (WTF::Lock::Lock): * wtf/MetaAllocator.cpp: (WTF::MetaAllocator::release): (WTF::MetaAllocatorHandle::shrink): (WTF::MetaAllocator::allocate): (WTF::MetaAllocator::currentStatistics): (WTF::MetaAllocator::addFreshFreeSpace): (WTF::MetaAllocator::debugFreeSpaceSize): * wtf/MetaAllocator.h: * wtf/SpinLock.h: * wtf/ThreadingPthreads.cpp: * wtf/ThreadingWin.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicStringImpl.cpp: (WTF::AtomicStringTableLocker::AtomicStringTableLocker): Tools: * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Lock.cpp: Added. (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/165908@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-07 22:38:59 +00:00
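The three levels of adaptation described in the commit message above (inlined CAS fast path, a short burst of spinning, then sleeping) can be sketched as follows. This is not the WTF implementation: it swaps the queue of thread nodes for a `std::mutex` plus `std::condition_variable`, and the class name, the bit names, and the spin limit of 40 are assumptions made for illustration.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

class AdaptiveLockSketch {
public:
    void lock()
    {
        // Level 1: a single CAS acquires a lock that is not held.
        uint8_t expected = 0;
        if (m_state.compare_exchange_weak(expected, isHeldBit, std::memory_order_acquire))
            return;
        lockSlow();
    }

    void unlock()
    {
        // A weak CAS may fail spuriously, so unlockSlow() must cope with no waiters.
        uint8_t expected = isHeldBit;
        if (m_state.compare_exchange_weak(expected, 0, std::memory_order_release))
            return;
        unlockSlow();
    }

private:
    static constexpr uint8_t isHeldBit = 1;
    static constexpr uint8_t hasParkedBit = 2;
    static constexpr unsigned spinLimit = 40; // Assumed value.

    void lockSlow()
    {
        unsigned spins = 0;
        for (;;) {
            uint8_t current = m_state.load();
            if (!(current & isHeldBit)) {
                if (m_state.compare_exchange_weak(current, current | isHeldBit, std::memory_order_acquire))
                    return;
                continue;
            }
            // Level 2: spin briefly, yielding, to ride out microcontention.
            if (spins < spinLimit) {
                ++spins;
                std::this_thread::yield();
                continue;
            }
            // Level 3: go to sleep until an unlocker wakes us.
            std::unique_lock<std::mutex> guard(m_parkMutex);
            current = m_state.load();
            if (!(current & isHeldBit))
                continue;
            if (!(current & hasParkedBit)
                && !m_state.compare_exchange_weak(current, current | hasParkedBit))
                continue;
            ++m_parked;
            m_parkCondition.wait(guard); // Spurious wakeups are fine: the loop retries.
            --m_parked;
        }
    }

    void unlockSlow()
    {
        std::lock_guard<std::mutex> guard(m_parkMutex);
        // The state cannot change here: the lock is held and parkers need m_parkMutex.
        m_state.store(m_parked > 1 ? hasParkedBit : 0, std::memory_order_release);
        if (m_parked)
            m_parkCondition.notify_one();
    }

    std::atomic<uint8_t> m_state { 0 };
    std::mutex m_parkMutex;
    std::condition_variable m_parkCondition;
    unsigned m_parked { 0 }; // Guarded by m_parkMutex.
};

// Helper for trying the sketch: several threads increment one counter under the lock.
inline long runContendedIncrements(unsigned threadCount, unsigned iterations)
{
    AdaptiveLockSketch lock;
    long counter = 0;
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < threadCount; ++i) {
        threads.emplace_back([&] {
            for (unsigned j = 0; j < iterations; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

The sketch carries a full system mutex and condition variable per lock, so it gives up the `sizeof(void*)` footprint the message highlights; it is meant to show the control flow only.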
WTF::Lock should not suffer from the thundering herd https://bugs.webkit.org/show_bug.cgi?id=147947 Reviewed by Geoffrey Garen. Source/WTF: This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with doing this is that it's not obvious after calling unparkOne() if there are any other threads that are still parked on the lock's queue. If we assume that there are and leave the hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that if there aren't actually any threads parked. On the other hand, if we assume that there aren't any threads parked and clear the hasParkedBit, then if there actually were some threads parked, then they may never be awoken since future calls to unlock() won't take slow path and so won't call unparkOne(). In other words, we need a way to be very precise about when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case that we clear the bit just as some thread gets parked on the queue. A similar problem arises in futexes, and one of the solutions is to have a thread that acquires a lock after parking set the hasParkedBit. This is what Rusty Russell's usersem does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked thread runs, then that barging thread will not know that there are threads parked. This could increase the severity of barging. Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security issues and so we can expose callbacks while ParkingLot is holding its internal locks. This change does exactly that for unparkOne(). The new variant of unparkOne() will call a user function while the queue from which we are unparking is locked. The callback is told basic stats about the queue: did we unpark a thread this time, and could there be more threads to unpark in the future. 
The callback runs while it's impossible for the queue state to change, since the ParkingLot's internal locks for the queue are held. This means that Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock inside the callback from unparkOne(). This takes care of the thundering herd problem while also reducing the greed that arises from barging threads. This required some careful reworking of the ParkingLot algorithm. The first thing I noticed was that the ThreadData::shouldPark flag was useless, since it's set exactly when ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create both hashtables and buckets, since the "callback is called while queue is locked" invariant requires that we didn't exit early due to the hashtable or bucket not being present. Note that all of this is done in such a way that the old unparkOne() and unparkAll() don't have to create any buckets, though they now may create the hashtable. We don't care as much about the hashtable being created by unpark since it's just such an unlikely scenario and it would only happen once. This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that benchmark. * benchmarks/LockSpeedTest.cpp: * wtf/Lock.cpp: (WTF::LockBase::unlockSlow): * wtf/Lock.h: (WTF::LockBase::isLocked): (WTF::LockBase::isFullyReset): * wtf/ParkingLot.cpp: (WTF::ParkingLot::parkConditionally): (WTF::ParkingLot::unparkOne): (WTF::ParkingLot::unparkAll): * wtf/ParkingLot.h: * wtf/WordLock.h: (WTF::WordLock::isLocked): (WTF::WordLock::isFullyReset): Tools: Add testing that checks that locks return to a pristine state after contention is over. 
* TestWebKitAPI/Tests/WTF/Lock.cpp: (TestWebKitAPI::LockInspector::isFullyReset): (TestWebKitAPI::runLockTest): (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/166072@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
// Method used for testing only.
bool isFullyReset() const
{
return !m_byte.load();
}
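To make the connection between `isFullyReset()` and the `hasParkedBit` handling described in the commit message above concrete, here is a small single-threaded model. `ToyQueue`, `UnparkResultSketch`, and `unlockSlowSketch` are hypothetical names for this illustration and only mimic the idea of a callback that runs while the queue is locked; the bit values are likewise assumed.

```cpp
#include <atomic>
#include <cstdint>
#include <deque>
#include <mutex>

constexpr uint8_t isHeldBit = 1;    // Assumed bit layout.
constexpr uint8_t hasParkedBit = 2;

// What the callback learns about the queue while the queue is still locked.
struct UnparkResultSketch {
    bool didUnparkThread;
    bool mayHaveMoreThreads;
};

// Toy queue of parked "threads", represented by plain ids.
class ToyQueue {
public:
    void park(int threadId)
    {
        std::lock_guard<std::mutex> guard(m_queueMutex);
        m_waiters.push_back(threadId);
    }

    // Dequeues at most one waiter, then invokes the callback before
    // releasing the queue lock, so the queue cannot change underneath it.
    template<typename Callback>
    void unparkOne(Callback&& callback)
    {
        std::lock_guard<std::mutex> guard(m_queueMutex);
        UnparkResultSketch result { false, false };
        if (!m_waiters.empty()) {
            m_waiters.pop_front();
            result.didUnparkThread = true;
        }
        result.mayHaveMoreThreads = !m_waiters.empty();
        callback(result);
    }

private:
    std::mutex m_queueMutex;
    std::deque<int> m_waiters;
};

// Releases the lock byte inside the callback, keeping hasParkedBit only
// when more waiters remain.
inline void unlockSlowSketch(std::atomic<uint8_t>& byte, ToyQueue& queue)
{
    queue.unparkOne([&](UnparkResultSketch result) {
        byte.store(result.mayHaveMoreThreads ? hasParkedBit : 0);
    });
}

// Same test as the isFullyReset() method above: every bit must be clear.
inline bool isFullyResetSketch(const std::atomic<uint8_t>& byte)
{
    return !byte.load();
}
```

Once the last waiter has been unparked, the byte returns to zero, which is the "pristine state after contention" that the Tools test in the commit message checks for.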
[WTF] Remove XXXLockBase since constexpr constructor can initialize static variables without calling global constructors https://bugs.webkit.org/show_bug.cgi?id=180495 Reviewed by Mark Lam. A very nice feature of C++11 is that a constexpr constructor can initialize static global variables without calling global constructors. We do not need to have XXXLockBase with a derived XXXLock class since StaticXXXLock can have constructors as long as they are constexpr. We remove a bunch of these classes, and set `XXXLock() = default;` explicitly for readability. C++11's default constructor is constexpr as long as its members' default constructors / default initializers are constexpr. * wtf/Condition.h: (WTF::ConditionBase::construct): Deleted. (WTF::ConditionBase::waitUntil): Deleted. (WTF::ConditionBase::waitFor): Deleted. (WTF::ConditionBase::wait): Deleted. (WTF::ConditionBase::notifyOne): Deleted. (WTF::ConditionBase::notifyAll): Deleted. (WTF::Condition::Condition): Deleted. * wtf/CountingLock.h: (WTF::CountingLock::CountingLock): Deleted. (WTF::CountingLock::~CountingLock): Deleted. * wtf/Lock.cpp: (WTF::Lock::lockSlow): (WTF::Lock::unlockSlow): (WTF::Lock::unlockFairlySlow): (WTF::Lock::safepointSlow): (WTF::LockBase::lockSlow): Deleted. (WTF::LockBase::unlockSlow): Deleted. (WTF::LockBase::unlockFairlySlow): Deleted. (WTF::LockBase::safepointSlow): Deleted. * wtf/Lock.h: (WTF::LockBase::construct): Deleted. (WTF::LockBase::lock): Deleted. (WTF::LockBase::tryLock): Deleted. (WTF::LockBase::try_lock): Deleted. (WTF::LockBase::unlock): Deleted. (WTF::LockBase::unlockFairly): Deleted. (WTF::LockBase::safepoint): Deleted. (WTF::LockBase::isHeld const): Deleted. (WTF::LockBase::isLocked const): Deleted. (WTF::LockBase::isFullyReset const): Deleted. (WTF::Lock::Lock): Deleted. * wtf/ReadWriteLock.cpp: (WTF::ReadWriteLock::readLock): (WTF::ReadWriteLock::readUnlock): (WTF::ReadWriteLock::writeLock): (WTF::ReadWriteLock::writeUnlock): (WTF::ReadWriteLockBase::construct): Deleted. 
(WTF::ReadWriteLockBase::readLock): Deleted. (WTF::ReadWriteLockBase::readUnlock): Deleted. (WTF::ReadWriteLockBase::writeLock): Deleted. (WTF::ReadWriteLockBase::writeUnlock): Deleted. * wtf/ReadWriteLock.h: (WTF::ReadWriteLock::read): (WTF::ReadWriteLock::write): (WTF::ReadWriteLockBase::ReadLock::tryLock): Deleted. (WTF::ReadWriteLockBase::ReadLock::lock): Deleted. (WTF::ReadWriteLockBase::ReadLock::unlock): Deleted. (WTF::ReadWriteLockBase::WriteLock::tryLock): Deleted. (WTF::ReadWriteLockBase::WriteLock::lock): Deleted. (WTF::ReadWriteLockBase::WriteLock::unlock): Deleted. (WTF::ReadWriteLockBase::read): Deleted. (WTF::ReadWriteLockBase::write): Deleted. (WTF::ReadWriteLock::ReadWriteLock): Deleted. * wtf/RecursiveLockAdapter.h: (WTF::RecursiveLockAdapter::RecursiveLockAdapter): Deleted. * wtf/WordLock.cpp: (WTF::WordLock::lockSlow): (WTF::WordLock::unlockSlow): (WTF::WordLockBase::lockSlow): Deleted. (WTF::WordLockBase::unlockSlow): Deleted. * wtf/WordLock.h: (WTF::WordLockBase::lock): Deleted. (WTF::WordLockBase::unlock): Deleted. (WTF::WordLockBase::isHeld const): Deleted. (WTF::WordLockBase::isLocked const): Deleted. (WTF::WordLockBase::isFullyReset const): Deleted. (WTF::WordLock::WordLock): Deleted. * wtf/WorkQueue.cpp: Canonical link: https://commits.webkit.org/196438@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225617 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-07 03:52:09 +00:00
Atomic<uint8_t> m_byte { 0 };
};
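The point of the commit message above, that a lock whose only member has a constexpr default initializer can be a static global without a runtime constructor, can be sketched with standard types. `ByteLockSketch` and `globalLockSketch` are hypothetical names, and `std::atomic` stands in for `WTF::Atomic`.

```cpp
#include <atomic>
#include <cstdint>

class ByteLockSketch {
public:
    // Defaulted and constexpr: valid because m_byte's default member
    // initializer uses std::atomic's constexpr constructor.
    constexpr ByteLockSketch() = default;

    // Mirrors the testing-only method above: true when no bit is set.
    bool isFullyReset() const
    {
        return !m_byte.load();
    }

private:
    std::atomic<uint8_t> m_byte { 0 };
};

// Eligible for constant initialization, so no global constructor needs
// to run for it at program startup.
static ByteLockSketch globalLockSketch;
```

A compiler that supports C++20 could enforce this with `constinit` on the global; that keyword is not used here so that the sketch stays within the C++11 feature set the message refers to.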
2015-08-07 22:38:59 +00:00
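The "Lightweight locks should be adaptive" change describes three levels of adaptation: an inlined CAS fast path, a short burst of spinning, and putting the thread to sleep. The following is a minimal sketch of that idea only, not the WTF::Lock implementation. It assumes a plain `std::mutex` plus `std::condition_variable` in place of the tagged-pointer queue of thread nodes, and the names `AdaptiveLockSketch`, `runCounterDemo`, and the `spinLimit` value are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical illustration of an adaptive lock; this is NOT WTF::Lock.
class AdaptiveLockSketch {
public:
    void lock()
    {
        // Level 1: a single CAS acquires a lock that is not held.
        if (tryLock())
            return;
        lockSlow();
    }

    void unlock()
    {
        m_isHeld.store(false);
        // Only pay for a wake-up when some thread may be asleep.
        if (m_parkedThreads.load() > 0) {
            std::lock_guard<std::mutex> guard(m_parkMutex);
            m_parkCondition.notify_one();
        }
    }

private:
    bool tryLock()
    {
        bool expected = false;
        return m_isHeld.compare_exchange_strong(expected, true);
    }

    void lockSlow()
    {
        // Level 2: spin briefly for microcontention, yielding between attempts.
        for (unsigned i = 0; i < spinLimit; ++i) {
            if (tryLock())
                return;
            std::this_thread::yield();
        }
        // Level 3: sleep until an unlock() wakes this thread and the CAS succeeds.
        m_parkedThreads.fetch_add(1);
        {
            std::unique_lock<std::mutex> guard(m_parkMutex);
            m_parkCondition.wait(guard, [this] { return tryLock(); });
        }
        m_parkedThreads.fetch_sub(1);
    }

    static constexpr unsigned spinLimit = 40; // Assumed value, chosen for illustration.
    std::atomic<bool> m_isHeld { false };
    std::atomic<unsigned> m_parkedThreads { 0 };
    std::mutex m_parkMutex;
    std::condition_variable m_parkCondition;
};

// Hypothetical helper: several threads increment a shared counter under the lock.
int runCounterDemo(unsigned threadCount, unsigned iterationsPerThread)
{
    AdaptiveLockSketch lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < threadCount; ++t) {
        threads.emplace_back([&] {
            for (unsigned i = 0; i < iterationsPerThread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    return counter;
}
```

Unlike the lock described in the commit message, this sketch is not pointer-sized: it trades the compact tagged-pointer representation for standard-library primitives so that the control flow stays easy to follow.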
Make CheckedLock the default Lock
https://bugs.webkit.org/show_bug.cgi?id=226157

Reviewed by Darin Adler.

Make CheckedLock the default Lock so that we get more benefits from Clang Thread Safety Analysis. Note that CheckedLock relies entirely on the existing Lock implementation and merely adds the Clang annotations for thread safety.

What this patch does is:
1. Rename the Lock class to UncheckedLock
2. Rename the CheckedLock class to Lock
3. Rename the Condition class to UncheckedCondition
4. Rename the CheckedCondition class to Condition
5. Update the types of certain variables from Lock / Condition to UncheckedLock / UncheckedCondition if I got a build failure.

Build failures are usually caused by the following facts:
- Locker<CheckedLock> doesn't subclass AbstractLocker, which a lot of JSC code passes as argument
- Locker<CheckedLock> has no move constructor
- Locker<CheckedLock> cannot be constructed from a lock pointer, only a reference

For now, CheckedLock and CheckedCondition remain as aliases to Lock and Condition, in their respective CheckedLock.h / CheckedCondition.h headers. I will drop them in a follow-up to reduce patch size. I will also follow up to try and get rid of as much usage of UncheckedLock and UncheckedCondition as possible. I did not try very hard in this patch to reduce patch size.

Source/JavaScriptCore:

* assembler/testmasm.cpp:
* dfg/DFGCommon.cpp:
* dfg/DFGThreadData.h:
* dfg/DFGWorklist.cpp:
(JSC::DFG::Worklist::Worklist):
* dfg/DFGWorklist.h:
* dynbench.cpp:
* heap/BlockDirectory.h:
(JSC::BlockDirectory::bitvectorLock):
* heap/CodeBlockSet.h:
(JSC::CodeBlockSet::getLock):
* heap/Heap.cpp:
(JSC::Heap::Heap):
* heap/Heap.h:
* heap/MarkedSpace.h:
(JSC::MarkedSpace::directoryLock):
* heap/MarkingConstraintSolver.h:
* heap/SlotVisitor.cpp:
(JSC::SlotVisitor::donateKnownParallel):
* heap/SlotVisitor.h:
* jit/ExecutableAllocator.cpp:
(JSC::ExecutableAllocator::getLock const):
(JSC::dumpJITMemory):
* jit/ExecutableAllocator.h:
(JSC::ExecutableAllocatorBase::getLock const):
* jit/JITWorklist.cpp:
(JSC::JITWorklist::JITWorklist):
* jit/JITWorklist.h:
* jsc.cpp:
* profiler/ProfilerDatabase.h:
* runtime/ConcurrentJSLock.h:
* runtime/DeferredWorkTimer.h:
* runtime/JSLock.h:
* runtime/SamplingProfiler.cpp:
(JSC::FrameWalker::FrameWalker):
(JSC::CFrameWalker::CFrameWalker):
(JSC::SamplingProfiler::takeSample):
* runtime/SamplingProfiler.h:
(JSC::SamplingProfiler::getLock):
* runtime/VM.h:
* runtime/VMTraps.cpp:
(JSC::VMTraps::invalidateCodeBlocksOnStack):
(JSC::VMTraps::VMTraps):
* runtime/VMTraps.h:
* tools/FunctionOverrides.h:
* tools/VMInspector.cpp:
(JSC::ensureIsSafeToLock):
* tools/VMInspector.h:
(JSC::VMInspector::getLock):
* wasm/WasmCalleeRegistry.h:
(JSC::Wasm::CalleeRegistry::getLock):
* wasm/WasmPlan.h:
* wasm/WasmStreamingCompiler.h:
* wasm/WasmThunks.h:
* wasm/WasmWorklist.cpp:
(JSC::Wasm::Worklist::Worklist):
* wasm/WasmWorklist.h:

Source/WebCore:

* Modules/indexeddb/server/IDBServer.cpp:
* Modules/webaudio/MediaElementAudioSourceNode.h:
* Modules/webdatabase/OriginLock.cpp:
* bindings/js/JSDOMGlobalObject.h:
* dom/Node.cpp:
* html/HTMLMediaElement.cpp:
(WebCore::HTMLMediaElement::createMediaPlayer):
* html/canvas/WebGLContextGroup.cpp:
(WebCore::WebGLContextGroup::objectGraphLockForAContext):
* html/canvas/WebGLContextGroup.h:
* html/canvas/WebGLContextObject.cpp:
(WebCore::WebGLContextObject::objectGraphLockForContext):
* html/canvas/WebGLContextObject.h:
* html/canvas/WebGLObject.h:
* html/canvas/WebGLRenderingContextBase.cpp:
(WebCore::WebGLRenderingContextBase::objectGraphLock):
* html/canvas/WebGLRenderingContextBase.h:
* html/canvas/WebGLSharedObject.cpp:
(WebCore::WebGLSharedObject::objectGraphLockForContext):
* html/canvas/WebGLSharedObject.h:
* page/scrolling/mac/ScrollingTreeMac.h:
* platform/audio/ReverbConvolver.cpp:
(WebCore::ReverbConvolver::backgroundThreadEntry):
* platform/graphics/ShadowBlur.cpp:
(WebCore::ScratchBuffer::lock):
(WebCore::ShadowBlur::drawRectShadowWithTiling):
(WebCore::ShadowBlur::drawInsetShadowWithTiling):
* platform/graphics/gstreamer/VideoSinkGStreamer.cpp:
* platform/graphics/gstreamer/eme/WebKitCommonEncryptionDecryptorGStreamer.cpp:

Source/WebKit:

* GPUProcess/graphics/RemoteGraphicsContextGL.cpp:
(WebKit::RemoteGraphicsContextGL::paintPixelBufferToImageBuffer):
* NetworkProcess/IndexedDB/WebIDBServer.cpp:
* UIProcess/API/glib/IconDatabase.h:
* UIProcess/mac/WKPrintingView.mm:
(-[WKPrintingView knowsPageRange:]):

Source/WTF:

* wtf/AutomaticThread.cpp:
(WTF::AutomaticThreadCondition::wait):
(WTF::AutomaticThreadCondition::waitFor):
(WTF::AutomaticThread::AutomaticThread):
* wtf/AutomaticThread.h:
* wtf/CheckedCondition.h:
* wtf/CheckedLock.h:
* wtf/Condition.h:
* wtf/Lock.cpp:
(WTF::UncheckedLock::lockSlow):
(WTF::UncheckedLock::unlockSlow):
(WTF::UncheckedLock::unlockFairlySlow):
(WTF::UncheckedLock::safepointSlow):
* wtf/Lock.h:
(WTF::WTF_ASSERTS_ACQUIRED_LOCK):
* wtf/MetaAllocator.cpp:
(WTF::MetaAllocator::release):
(WTF::MetaAllocator::MetaAllocator):
(WTF::MetaAllocator::allocate):
(WTF::MetaAllocator::currentStatistics):
* wtf/MetaAllocator.h:
* wtf/ParallelHelperPool.cpp:
(WTF::ParallelHelperPool::ParallelHelperPool):
* wtf/ParallelHelperPool.h:
* wtf/RecursiveLockAdapter.h:
* wtf/WorkerPool.cpp:
(WTF::WorkerPool::WorkerPool):
* wtf/WorkerPool.h:

Tools:

* TestWebKitAPI/Tests/WTF/CheckedConditionTest.cpp:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
* TestWebKitAPI/Tests/WTF/MetaAllocator.cpp:
* WebKitTestRunner/InjectedBundle/AccessibilityController.cpp:
(WTR::AXThread::createThreadIfNeeded):

Canonical link: https://commits.webkit.org/238070@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@277943 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2021-05-24 05:37:41 +00:00
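The change above makes the annotated lock the default so that Clang Thread Safety Analysis can check lock usage at compile time. The following is a minimal sketch of how such annotations fit together, assuming Clang's documented `capability`, `scoped_lockable`, `acquire_capability`, `release_capability`, and `guarded_by` attributes. The `SKETCH_TSA` macro and the `SketchLock`, `SketchLocker`, and `SketchCounter` types are hypothetical stand-ins, not the WTF macros or classes, and the attributes compile away on other compilers.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical macro: expands to a Clang attribute, and to nothing elsewhere.
#if defined(__clang__)
#define SKETCH_TSA(x) __attribute__((x))
#else
#define SKETCH_TSA(x)
#endif

// A lock type that the analysis treats as a capability.
class SKETCH_TSA(capability("mutex")) SketchLock {
public:
    // The bodies wrap an unannotated std::mutex, so they opt out of analysis.
    void lock() SKETCH_TSA(acquire_capability()) SKETCH_TSA(no_thread_safety_analysis) { m_mutex.lock(); }
    void unlock() SKETCH_TSA(release_capability()) SKETCH_TSA(no_thread_safety_analysis) { m_mutex.unlock(); }

private:
    std::mutex m_mutex;
};

// A non-movable scoped holder, constructed from a reference only.
class SKETCH_TSA(scoped_lockable) SketchLocker {
public:
    explicit SketchLocker(SketchLock& lock) SKETCH_TSA(acquire_capability(lock))
        : m_lock(lock)
    {
        m_lock.lock();
    }

    ~SketchLocker() SKETCH_TSA(release_capability()) { m_lock.unlock(); }

    SketchLocker(const SketchLocker&) = delete;
    SketchLocker& operator=(const SketchLocker&) = delete;

private:
    SketchLock& m_lock;
};

class SketchCounter {
public:
    void increment()
    {
        SketchLocker locker { m_lock };
        ++m_value; // Accepted by the analysis: m_lock is held in this scope.
    }

    int value()
    {
        SketchLocker locker { m_lock };
        return m_value;
    }

private:
    SketchLock m_lock;
    // Under Clang with -Wthread-safety, touching m_value without m_lock is diagnosed.
    int m_value SKETCH_TSA(guarded_by(m_lock)) { 0 };
};
```

With these annotations in place, removing the `SketchLocker` line from `increment()` would be reported by Clang when `-Wthread-safety` is enabled, which is the kind of benefit the commit message refers to.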
// Asserts that the lock is held.
// This can be used in cases where the annotations cannot be added to the function
// declaration.
inline void assertIsHeld(const Lock& lock) WTF_ASSERTS_ACQUIRED_LOCK(lock) { ASSERT_UNUSED(lock, lock.isHeld()); }
// Locker specialization to use with Lock.
// Non-movable simple scoped lock holder.
// Example: Locker locker { m_lock };
template <>
Stop using UncheckedLock in html/canvas
https://bugs.webkit.org/show_bug.cgi?id=226186

Reviewed by Darin Adler.

Source/WebCore:

Stop using UncheckedLock in html/canvas. This is a step towards phasing out UncheckedLock, in favor of the checked Lock. Technically, the code still doesn't do much thread-safety analysis after this change. It is very difficult to adopt thread-safety analysis here because the call sites don't always lock (there are cases where no locking is needed). It is also hard to get a reference to the various locks to make WTF_REQUIRES_LOCK() work.

* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::deleteVertexArray):
* html/canvas/WebGLContextGroup.cpp:
(WebCore::WebGLContextGroup::objectGraphLockForAContext):
* html/canvas/WebGLContextGroup.h:
* html/canvas/WebGLContextObject.cpp:
(WebCore::WebGLContextObject::objectGraphLockForContext):
* html/canvas/WebGLContextObject.h:
* html/canvas/WebGLObject.h:
* html/canvas/WebGLRenderingContextBase.cpp:
(WebCore::WebGLRenderingContextBase::objectGraphLock):
* html/canvas/WebGLRenderingContextBase.h:
* html/canvas/WebGLSharedObject.cpp:
(WebCore::WebGLSharedObject::objectGraphLockForContext):
* html/canvas/WebGLSharedObject.h:

Source/WTF:

Allow converting a Locker<Lock> to an AbstractLocker type. This allows porting code from UncheckedLock to Lock even if said code is passing the Locker<Lock> as parameter around as an AbstractLocker. This is very common in JSC and in html/canvas. Also make DropLockForScope work with Locker<Lock> to help port code over.

* wtf/Lock.h:

Canonical link: https://commits.webkit.org/238141@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@278057 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2021-05-25 23:19:19 +00:00
class WTF_CAPABILITY_SCOPED_LOCK Locker<Lock> : public AbstractLocker {
public:
explicit Locker(Lock& lock) WTF_ACQUIRES_LOCK(lock)
: m_lock(lock)
, m_isLocked(true)
2021-05-24 05:37:41 +00:00
    {
        m_lock.lock();
    }

    // Adopts a lock that the caller already holds; does not lock again.
    Locker(AdoptLockTag, Lock& lock) WTF_REQUIRES_LOCK(lock)
        : m_lock(lock)
        , m_isLocked(true)
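    {
    }

    // Releases the lock on destruction unless unlockEarly() already released it.
    ~Locker() WTF_RELEASES_LOCK()
    {
        if (m_isLocked)
            m_lock.unlock();
    }

    // Releases the lock before the end of the scope; the Locker must still hold it.
    void unlockEarly() WTF_RELEASES_LOCK()
    {
        ASSERT(m_isLocked);
        m_isLocked = false;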
        m_lock.unlock();
    }

    Locker(const Locker<Lock>&) = delete;
    Locker& operator=(const Locker<Lock>&) = delete;
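The members above follow a common scoped-guard pattern: lock on construction (or adopt a lock that is already held), unlock on destruction, and allow an early release. The following is a minimal sketch of that pattern, not the WTF implementation. `SketchLock`, `SketchLocker`, and `SketchAdoptLock` are hypothetical stand-ins, `std::mutex` replaces the real lock, and the Clang thread-safety annotations are omitted.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a lock type; this is NOT WTF::Lock.
// isHeld() exists only so the single-threaded demo can observe the state.
class SketchLock {
public:
    void lock()
    {
        m_mutex.lock();
        m_isHeld = true;
    }

    void unlock()
    {
        m_isHeld = false;
        m_mutex.unlock();
    }

    bool isHeld() const { return m_isHeld; }

private:
    std::mutex m_mutex;
    bool m_isHeld { false };
};

// Hypothetical tag used to select the adopting constructor.
struct SketchAdoptLockTag { };
constexpr SketchAdoptLockTag SketchAdoptLock { };

class SketchLocker {
public:
    // Acquires the lock on construction.
    explicit SketchLocker(SketchLock& lock)
        : m_lock(lock)
        , m_isLocked(true)
    {
        m_lock.lock();
    }

    // Adopts a lock the caller already holds; does not lock again.
    SketchLocker(SketchAdoptLockTag, SketchLock& lock)
        : m_lock(lock)
        , m_isLocked(true)
    {
    }

    // Releases the lock unless unlockEarly() already released it.
    ~SketchLocker()
    {
        if (m_isLocked)
            m_lock.unlock();
    }

    // Releases the lock before the end of the scope.
    void unlockEarly()
    {
        assert(m_isLocked);
        m_isLocked = false;
        m_lock.unlock();
    }

    SketchLocker(const SketchLocker&) = delete;
    SketchLocker& operator=(const SketchLocker&) = delete;

private:
    SketchLock& m_lock;
    bool m_isLocked;
};
```

In this sketch, clearing `m_isLocked` in `unlockEarly()` is what keeps the destructor from unlocking a second time, which mirrors the `if (m_isLocked)` check in the destructor above.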
Stop using UncheckedLock in html/canvas
https://bugs.webkit.org/show_bug.cgi?id=226186

Reviewed by Darin Adler.

Source/WebCore:

Stop using UncheckedLock in html/canvas. This is a step towards phasing out UncheckedLock, in favor of the checked Lock.

Technically, the code still doesn't do much thread-safety analysis after this change. It is very difficult to adopt thread-safety analysis here because the call sites don't always lock (there are cases where no locking is needed). It is also hard to get a reference to the various locks to make WTF_REQUIRES_LOCK() work.

* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::deleteVertexArray):
* html/canvas/WebGLContextGroup.cpp:
(WebCore::WebGLContextGroup::objectGraphLockForAContext):
* html/canvas/WebGLContextGroup.h:
* html/canvas/WebGLContextObject.cpp:
(WebCore::WebGLContextObject::objectGraphLockForContext):
* html/canvas/WebGLContextObject.h:
* html/canvas/WebGLObject.h:
* html/canvas/WebGLRenderingContextBase.cpp:
(WebCore::WebGLRenderingContextBase::objectGraphLock):
* html/canvas/WebGLRenderingContextBase.h:
* html/canvas/WebGLSharedObject.cpp:
(WebCore::WebGLSharedObject::objectGraphLockForContext):
* html/canvas/WebGLSharedObject.h:

Source/WTF:

Allow converting a Locker<Lock> to an AbstractLocker type. This allows porting code from UncheckedLock to Lock even if said code is passing the Locker<Lock> as parameter around as an AbstractLocker. This is very common in JSC and in html/canvas.

Also make DropLockForScope work with Locker<Lock> to help port code over.

* wtf/Lock.h:

Canonical link: https://commits.webkit.org/238141@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@278057 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2021-05-25 23:19:19 +00:00
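The commit message above describes code that passes a locker around as an AbstractLocker parameter. One way to picture that idiom is a helper that takes a reference to an abstract locker type as evidence that the caller holds a lock. The sketch below assumes the conversion is done by inheritance, which the message does not confirm. `SketchAbstractLocker`, `SketchScopedLocker`, and `SketchQueue` are hypothetical names, and `std::mutex` stands in for the real lock.

```cpp
#include <mutex>
#include <vector>

// Hypothetical empty base type: a reference to it acts only as a token
// showing that the caller holds some lock. This is NOT WTF::AbstractLocker.
class SketchAbstractLocker {
public:
    SketchAbstractLocker(const SketchAbstractLocker&) = delete;
    SketchAbstractLocker& operator=(const SketchAbstractLocker&) = delete;

protected:
    SketchAbstractLocker() = default;
};

// Hypothetical scoped locker that converts implicitly to the base type.
class SketchScopedLocker : public SketchAbstractLocker {
public:
    explicit SketchScopedLocker(std::mutex& mutex)
        : m_mutex(mutex)
    {
        m_mutex.lock();
    }

    ~SketchScopedLocker() { m_mutex.unlock(); }

private:
    std::mutex& m_mutex;
};

class SketchQueue {
public:
    void push(int value)
    {
        SketchScopedLocker locker(m_mutex);
        pushWithLockHeld(locker, value);
    }

    int size()
    {
        SketchScopedLocker locker(m_mutex);
        return sizeWithLockHeld(locker);
    }

private:
    // The unnamed locker parameter documents that m_mutex must be held.
    void pushWithLockHeld(const SketchAbstractLocker&, int value)
    {
        m_values.push_back(value);
    }

    int sizeWithLockHeld(const SketchAbstractLocker&) const
    {
        return static_cast<int>(m_values.size());
    }

    std::mutex m_mutex;
    std::vector<int> m_values;
};
```

The token parameter only documents intent: it shows that some locker exists at the call site, not that it guards the right lock. That gap is the kind of check that annotations such as WTF_REQUIRES_LOCK() are meant to cover.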
2021-05-24 05:37:41 +00:00
private:
Stop using UncheckedLock in html/canvas
https://bugs.webkit.org/show_bug.cgi?id=226186

Reviewed by Darin Adler.

Source/WebCore:

Stop using UncheckedLock in html/canvas. This is a step towards phasing out UncheckedLock in favor of the checked Lock.

Technically, the code still doesn't do much thread-safety analysis after this change. It is very difficult to adopt thread-safety analysis here because the call sites don't always lock (there are cases where no locking is needed). It is also hard to get a reference to the various locks to make WTF_REQUIRES_LOCK() work.

* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::deleteVertexArray):
* html/canvas/WebGLContextGroup.cpp:
(WebCore::WebGLContextGroup::objectGraphLockForAContext):
* html/canvas/WebGLContextGroup.h:
* html/canvas/WebGLContextObject.cpp:
(WebCore::WebGLContextObject::objectGraphLockForContext):
* html/canvas/WebGLContextObject.h:
* html/canvas/WebGLObject.h:
* html/canvas/WebGLRenderingContextBase.cpp:
(WebCore::WebGLRenderingContextBase::objectGraphLock):
* html/canvas/WebGLRenderingContextBase.h:
* html/canvas/WebGLSharedObject.cpp:
(WebCore::WebGLSharedObject::objectGraphLockForContext):
* html/canvas/WebGLSharedObject.h:

Source/WTF:

Allow converting a Locker<Lock> to an AbstractLocker type. This allows porting code from UncheckedLock to Lock even if that code passes the Locker<Lock> around as an AbstractLocker parameter. This is very common in JSC and in html/canvas.

Also make DropLockForScope work with Locker<Lock> to help port code over.

* wtf/Lock.h:

Canonical link: https://commits.webkit.org/238141@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@278057 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2021-05-25 23:19:19 +00:00
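The conversion described in the commit message above (a scoped locker that can be handed to functions expecting an abstract "a lock is held" token) can be sketched roughly as follows. This is a minimal illustration of one possible way to do it, not the code in wtf/Lock.h: `AbstractLockerSketch`, `ScopedLockerSketch`, `CounterSketch` and `incrementTwice` are hypothetical names, and `std::mutex` stands in for `WTF::Lock`.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for an abstract "some lock is held" token.
// It is non-copyable so that it can only be passed by reference.
class AbstractLockerSketch {
public:
    AbstractLockerSketch() = default;
    AbstractLockerSketch(const AbstractLockerSketch&) = delete;
    AbstractLockerSketch& operator=(const AbstractLockerSketch&) = delete;
};

// Hypothetical scoped locker over std::mutex (an assumption; not WTF::Locker).
class ScopedLockerSketch {
public:
    explicit ScopedLockerSketch(std::mutex& mutex)
        : m_mutex(mutex)
    {
        m_mutex.lock();
    }

    ~ScopedLockerSketch() { m_mutex.unlock(); }

    // The conversion: lets callers pass this locker wherever the abstract
    // token type is expected, without the two types sharing a base class.
    operator const AbstractLockerSketch&() const { return m_token; }

private:
    std::mutex& m_mutex;
    AbstractLockerSketch m_token;
};

class CounterSketch {
public:
    std::mutex& lock() { return m_lock; }

    // The parameter documents that the caller holds a lock. Nothing here
    // verifies that it is the right lock.
    int increment(const AbstractLockerSketch&) { return ++m_value; }

private:
    std::mutex m_lock;
    int m_value { 0 };
};

int incrementTwice(CounterSketch& counter)
{
    ScopedLockerSketch locker(counter.lock());
    counter.increment(locker); // Implicitly converted to the token type.
    return counter.increment(locker);
}
```

Calling `incrementTwice` on a fresh `CounterSketch` takes the lock once and performs both increments under it. The token parameter is a convention rather than a compile-time check, which is consistent with the commit message's remark that this code still gets little thread-safety analysis.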
    // Support DropLockForScope even though it doesn't support thread safety analysis.
    template<typename>
    friend class DropLockForScope;

    void lock() WTF_ACQUIRES_LOCK(m_lock)
    {
        m_lock.lock();
        compilerFence();
    }

    void unlock() WTF_RELEASES_LOCK(m_lock)
    {
        compilerFence();
        m_lock.unlock();
    }
    Lock& m_lock;
    bool m_isLocked { false };
};
Locker(Lock&) -> Locker<Lock>;
Locker(AdoptLockTag, Lock&) -> Locker<Lock>;
[WTF] Remove XXXLockBase since constexpr constructor can initialize static variables without calling global constructors
https://bugs.webkit.org/show_bug.cgi?id=180495

Reviewed by Mark Lam.

A very nice feature of C++11 is that a constexpr constructor can initialize static global variables without calling global constructors. We do not need XXXLockBase with a derived XXXLock class, since StaticXXXLock can have constructors as long as they are constexpr.

We remove a bunch of these classes and set `XXXLock() = default;` explicitly for readability. In C++11, a defaulted default constructor is constexpr as long as its members' default constructors / default initializers are constexpr.

* wtf/Condition.h:
(WTF::ConditionBase::construct): Deleted.
(WTF::ConditionBase::waitUntil): Deleted.
(WTF::ConditionBase::waitFor): Deleted.
(WTF::ConditionBase::wait): Deleted.
(WTF::ConditionBase::notifyOne): Deleted.
(WTF::ConditionBase::notifyAll): Deleted.
(WTF::Condition::Condition): Deleted.
* wtf/CountingLock.h:
(WTF::CountingLock::CountingLock): Deleted.
(WTF::CountingLock::~CountingLock): Deleted.
* wtf/Lock.cpp:
(WTF::Lock::lockSlow):
(WTF::Lock::unlockSlow):
(WTF::Lock::unlockFairlySlow):
(WTF::Lock::safepointSlow):
(WTF::LockBase::lockSlow): Deleted.
(WTF::LockBase::unlockSlow): Deleted.
(WTF::LockBase::unlockFairlySlow): Deleted.
(WTF::LockBase::safepointSlow): Deleted.
* wtf/Lock.h:
(WTF::LockBase::construct): Deleted.
(WTF::LockBase::lock): Deleted.
(WTF::LockBase::tryLock): Deleted.
(WTF::LockBase::try_lock): Deleted.
(WTF::LockBase::unlock): Deleted.
(WTF::LockBase::unlockFairly): Deleted.
(WTF::LockBase::safepoint): Deleted.
(WTF::LockBase::isHeld const): Deleted.
(WTF::LockBase::isLocked const): Deleted.
(WTF::LockBase::isFullyReset const): Deleted.
(WTF::Lock::Lock): Deleted.
* wtf/ReadWriteLock.cpp:
(WTF::ReadWriteLock::readLock):
(WTF::ReadWriteLock::readUnlock):
(WTF::ReadWriteLock::writeLock):
(WTF::ReadWriteLock::writeUnlock):
(WTF::ReadWriteLockBase::construct): Deleted.
(WTF::ReadWriteLockBase::readLock): Deleted.
(WTF::ReadWriteLockBase::readUnlock): Deleted.
(WTF::ReadWriteLockBase::writeLock): Deleted.
(WTF::ReadWriteLockBase::writeUnlock): Deleted.
* wtf/ReadWriteLock.h:
(WTF::ReadWriteLock::read):
(WTF::ReadWriteLock::write):
(WTF::ReadWriteLockBase::ReadLock::tryLock): Deleted.
(WTF::ReadWriteLockBase::ReadLock::lock): Deleted.
(WTF::ReadWriteLockBase::ReadLock::unlock): Deleted.
(WTF::ReadWriteLockBase::WriteLock::tryLock): Deleted.
(WTF::ReadWriteLockBase::WriteLock::lock): Deleted.
(WTF::ReadWriteLockBase::WriteLock::unlock): Deleted.
(WTF::ReadWriteLockBase::read): Deleted.
(WTF::ReadWriteLockBase::write): Deleted.
(WTF::ReadWriteLock::ReadWriteLock): Deleted.
* wtf/RecursiveLockAdapter.h:
(WTF::RecursiveLockAdapter::RecursiveLockAdapter): Deleted.
* wtf/WordLock.cpp:
(WTF::WordLock::lockSlow):
(WTF::WordLock::unlockSlow):
(WTF::WordLockBase::lockSlow): Deleted.
(WTF::WordLockBase::unlockSlow): Deleted.
* wtf/WordLock.h:
(WTF::WordLockBase::lock): Deleted.
(WTF::WordLockBase::unlock): Deleted.
(WTF::WordLockBase::isHeld const): Deleted.
(WTF::WordLockBase::isLocked const): Deleted.
(WTF::WordLockBase::isFullyReset const): Deleted.
(WTF::WordLock::WordLock): Deleted.
* wtf/WorkQueue.cpp:

Canonical link: https://commits.webkit.org/196438@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225617 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-07 03:52:09 +00:00
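The constexpr-constructor point in the commit message above can be illustrated with a small sketch. It assumes a lock whose only state is one atomic byte; `TinyLockSketch`, `SKETCH_CONSTINIT`, `globalLock` and `exerciseGlobalLock` are hypothetical names, not WTF classes. Where the compiler supports C++20 `constinit`, the macro makes it reject the global if it cannot be constant-initialized; on older compilers the macro expands to nothing and the sketch only shows the shape of the declaration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical byte-sized lock used only to illustrate constant initialization.
class TinyLockSketch {
public:
    // The defaulted constructor is constexpr because the only member has a
    // constexpr default member initializer.
    constexpr TinyLockSketch() = default;

    bool tryLock()
    {
        unsigned char expected = 0;
        return m_byte.compare_exchange_strong(expected, 1, std::memory_order_acquire);
    }

    void unlock() { m_byte.store(0, std::memory_order_release); }

    bool isLocked() const { return m_byte.load(std::memory_order_acquire) != 0; }

private:
    std::atomic<unsigned char> m_byte { 0 };
};

// With C++20, constinit asks the compiler to verify that the variable is
// initialized at compile time, so no global constructor runs at startup.
#if defined(__cpp_constinit)
#define SKETCH_CONSTINIT constinit
#else
#define SKETCH_CONSTINIT
#endif

SKETCH_CONSTINIT static TinyLockSketch globalLock;

// Takes the global lock, checks that a second attempt fails, then releases it.
bool exerciseGlobalLock()
{
    bool first = globalLock.tryLock();
    bool second = globalLock.tryLock();
    globalLock.unlock();
    return first && !second;
}
```

Because the same class serves for both static and non-static instances, no separate base class with a `construct()`-style method is needed, which is the simplification the commit message describes.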
using LockHolder = Locker<Lock>;
Lightweight locks should be adaptive
https://bugs.webkit.org/show_bug.cgi?id=147545

Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

* dfg/DFGCommon.cpp:
(JSC::DFG::startCrashing):
* heap/CopiedBlock.h:
(JSC::CopiedBlock::workListLock):
* heap/CopiedBlockInlines.h:
(JSC::CopiedBlock::shouldReportLiveBytes):
(JSC::CopiedBlock::reportLiveBytes):
* heap/CopiedSpace.cpp:
(JSC::CopiedSpace::doneFillingBlock):
* heap/CopiedSpace.h:
(JSC::CopiedSpace::CopiedGeneration::CopiedGeneration):
* heap/CopiedSpaceInlines.h:
(JSC::CopiedSpace::recycleEvacuatedBlock):
* heap/GCThreadSharedData.cpp:
(JSC::GCThreadSharedData::didStartCopying):
* heap/GCThreadSharedData.h:
(JSC::GCThreadSharedData::getNextBlocksToCopy):
* heap/ListableHandler.h:
(JSC::ListableHandler::List::addThreadSafe):
(JSC::ListableHandler::List::addNotThreadSafe):
* heap/MachineStackMarker.cpp:
(JSC::MachineThreads::tryCopyOtherThreadStacks):
* heap/SlotVisitorInlines.h:
(JSC::SlotVisitor::copyLater):
* parser/SourceProvider.cpp:
(JSC::SourceProvider::~SourceProvider):
(JSC::SourceProvider::getID):
* profiler/ProfilerDatabase.cpp:
(JSC::Profiler::Database::addDatabaseToAtExit):
(JSC::Profiler::Database::removeDatabaseFromAtExit):
(JSC::Profiler::Database::removeFirstAtExitDatabase):
* runtime/TypeProfilerLog.h:

Source/WebCore:

* bindings/objc/WebScriptObject.mm:
(WebCore::getJSWrapper):
(WebCore::addJSWrapper):
(WebCore::removeJSWrapper):
(WebCore::removeJSWrapperIfRetainCountOne):
* platform/audio/mac/CARingBuffer.cpp:
(WebCore::CARingBuffer::setCurrentFrameBounds):
(WebCore::CARingBuffer::getCurrentFrameBounds):
* platform/audio/mac/CARingBuffer.h:
* platform/ios/wak/WAKWindow.mm:
(-[WAKWindow setExposedScrollViewRect:]):
(-[WAKWindow exposedScrollViewRect]):

Source/WebKit2:

* WebProcess/WebPage/EventDispatcher.cpp:
(WebKit::EventDispatcher::clearQueuedTouchEventsForPage):
(WebKit::EventDispatcher::getQueuedTouchEventsForPage):
(WebKit::EventDispatcher::touchEvent):
(WebKit::EventDispatcher::dispatchTouchEvents):
* WebProcess/WebPage/EventDispatcher.h:
* WebProcess/WebPage/ViewUpdateDispatcher.cpp:
(WebKit::ViewUpdateDispatcher::visibleContentRectUpdate):
(WebKit::ViewUpdateDispatcher::dispatchVisibleContentRectUpdate):
* WebProcess/WebPage/ViewUpdateDispatcher.h:

Source/WTF:

A common idiom in WebKit is to use spinlocks. We use them because the lock acquisition overhead is lower than system locks and because they take dramatically less space than system locks. The speed and space advantages of spinlocks can be astonishing: an uncontended spinlock acquire is up to 10x faster and under microcontention - short critical section with two or more threads taking turns - spinlocks are up to 100x faster. Spinlocks take only 1 byte or 4 bytes depending on the flavor, while system locks take 64 bytes or more. Clearly, WebKit should continue to avoid system locks - they are just far too slow and far too big.

But there is a problem with this idiom. System lock implementations will sleep a thread when it attempts to acquire a lock that is held, while spinlocks will cause the thread to burn CPU. In WebKit spinlocks, the thread will repeatedly call sched_yield(). This is awesome for microcontention, but awful when the lock will not be released for a while. In fact, when critical sections take tens of microseconds or more, the CPU time cost of our spinlocks is almost 100x more than the CPU time cost of a system lock. This case doesn't arise too frequently in our current uses of spinlocks, but that's probably because right now there are places where we make a conscious decision to use system locks - even though they use more memory and are slower - because we don't want to waste CPU cycles when a thread has to wait a while to acquire the lock.

The solution is to just implement a modern adaptive mutex in WTF. Luckily, this isn't a new concept. This patch implements a mutex that is reminiscent of the kinds of low-overhead locks that JVMs use. The actual implementation here is inspired by some of the ideas from [1]. The idea is simple: the fast path is an inlined CAS to immediately acquire a lock that isn't held, the slow path tries some number of spins to acquire the lock, and if that fails, the thread is put on a queue and put to sleep. The queue is made up of statically allocated thread nodes and the lock itself is a tagged pointer: either it is just bits telling us the complete lock state (not held or held) or it is a pointer to the head of a queue of threads waiting to acquire the lock. This approach gives WTF::Lock three different levels of adaptation: an inlined fast path if the lock is not contended, a short burst of spinning for microcontention, and a full-blown queue for critical sections that are held for a long time.

On a locking microbenchmark, this new Lock exhibits the following performance characteristics:

- Lock+unlock on an uncontended no-op critical section: 2x slower than SpinLock and 3x faster than a system mutex.
- Lock+unlock on a contended no-op critical section: 2x slower than SpinLock and 100x faster than a system mutex.
- CPU time spent in lock() on a lock held for a while: same as system mutex, 90x less than a SpinLock.
- Memory usage: sizeof(void*), so on 64-bit it's 8x less than a system mutex but 2x worse than a SpinLock.

This patch replaces all uses of SpinLock with Lock, since our critical sections are not no-ops so if you do basically anything in your critical section, the Lock overhead will be invisible. Also, in all places where we used SpinLock, we could tolerate 8 bytes of overhead instead of 4. Performance benchmarking using JSC macrobenchmarks shows no difference, which is as it should be: the purpose of this change is to reduce CPU time wasted, not wallclock time. This patch doesn't replace any uses of ByteSpinLock, since we expect that the space benefits of having a lock that just uses a byte are still better than the CPU wastage benefits of Lock. But, this work will enable some future work to create locks that will fit in just 1.6 bits: https://bugs.webkit.org/show_bug.cgi?id=147665.

Rolling this back in after fixing Lock::unlockSlow() for architectures that have a truly weak CAS. Since the Lock::unlock() fast path can go to slow path spuriously, it may go there even if there aren't any threads on the Lock's queue. So, unlockSlow() must be able to deal with the possibility of a null queue head.

[1] http://www.filpizlo.com/papers/pizlo-pppj2011-fable.pdf

* WTF.vcxproj/WTF.vcxproj:
* WTF.xcodeproj/project.pbxproj:
* benchmarks: Added.
* benchmarks/LockSpeedTest.cpp: Added.
(main):
* wtf/Atomics.h:
(WTF::Atomic::compareExchangeWeak):
(WTF::Atomic::compareExchangeStrong):
* wtf/CMakeLists.txt:
* wtf/Lock.cpp: Added.
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
* wtf/Lock.h: Added.
(WTF::LockBase::lock):
(WTF::LockBase::unlock):
(WTF::LockBase::isHeld):
(WTF::LockBase::isLocked):
(WTF::Lock::Lock):
* wtf/MetaAllocator.cpp:
(WTF::MetaAllocator::release):
(WTF::MetaAllocatorHandle::shrink):
(WTF::MetaAllocator::allocate):
(WTF::MetaAllocator::currentStatistics):
(WTF::MetaAllocator::addFreshFreeSpace):
(WTF::MetaAllocator::debugFreeSpaceSize):
* wtf/MetaAllocator.h:
* wtf/SpinLock.h:
* wtf/ThreadingPthreads.cpp:
* wtf/ThreadingWin.cpp:
* wtf/text/AtomicString.cpp:
* wtf/text/AtomicStringImpl.cpp:
(WTF::AtomicStringTableLocker::AtomicStringTableLocker):

Tools:

* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.vcxproj/TestWebKitAPI.vcxproj:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Lock.cpp: Added.
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):

Canonical link: https://commits.webkit.org/165908@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188169 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-07 22:38:59 +00:00
} // namespace WTF
using WTF::Lock;
using WTF::LockHolder;