haikuwebkit/Source/WTF/wtf/LockAlgorithm.h

157 lines
5.0 KiB
C

The GC should be optionally concurrent and disabled by default
https://bugs.webkit.org/show_bug.cgi?id=164454

Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again.

It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled.

To reach this milestone, I needed to do a bunch of stuff:

- The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors.

- The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of.

- All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked).

- JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details.

  The mutator fencing rules are as follows:

  - Store-store fence before and after setting the butterfly.

  - Store-store fence before setting structure if you had changed the shape of the butterfly.

  - Store-store fence after initializing all fields in an allocation.

- A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads).

- The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying:

      structure->materializePropertyMap();
      ...
      structure->propertyTable()->things

  Instead of saying:

      PropertyTable* table = structure->ensurePropertyTable();
      ...
      table->things

  Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race.

- The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways.

This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark).

By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs.
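The mutator fencing rules described in this message (initialize, fence, then publish, with a matching load-load fence on the collector side) can be illustrated with a small standalone sketch. This is not JavaScriptCore code: `ToyButterfly`, `ToyObject`, `publishButterfly`, and `scanButterfly` are hypothetical names invented for the example, and standard C++ release/acquire fences are used as stand-ins. They are at least as strong as the store-store and load-load fences the message talks about, so the sketch may be more conservative than what the patch emits.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for out-of-line storage; not JSC's Butterfly.
struct ToyButterfly {
    std::size_t length;
    int slots[4];
};

// Hypothetical stand-in for an object that points at its storage.
struct ToyObject {
    std::atomic<ToyButterfly*> butterfly { nullptr };
};

// Mutator side. The caller has already initialized every field of *fresh.
// The fence keeps those initializing stores ordered before the store that
// makes the pointer visible (assumption: a release fence is used here in
// place of a pure store-store fence).
inline void publishButterfly(ToyObject& object, ToyButterfly* fresh)
{
    std::atomic_thread_fence(std::memory_order_release);
    object.butterfly.store(fresh, std::memory_order_relaxed);
}

// Collector side. Load the pointer, fence, then read through it, so the
// reads of length and slots cannot be ordered before the pointer load.
// Returns -1 when no storage has been published yet.
inline long scanButterfly(const ToyObject& object)
{
    ToyButterfly* seen = object.butterfly.load(std::memory_order_relaxed);
    if (!seen)
        return -1;
    std::atomic_thread_fence(std::memory_order_acquire);
    long sum = 0;
    for (std::size_t i = 0; i < seen->length; ++i)
        sum += seen->slots[i];
    return sum;
}
```

In this sketch a scanning thread either sees no storage at all or sees fully initialized storage. It does not model the structure/butterfly pairing rule or the per-cell lock that the message uses for the cases fences cannot cover.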
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted.

Source/WTF:

The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage.

To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>.

WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently.

Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of.

* WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h:

Canonical link: https://commits.webkit.org/182434@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
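The Source/WTF notes in the commit above describe a lock algorithm that is templatized on the lock type and its two bit values, and that is exposed as static methods taking an atomic word as their first argument. A rough sketch of that shape follows. It is not the real `WTF::LockAlgorithm`: `ToyLockAlgorithm` and `ToyWTFLock` are hypothetical names, `std::atomic` stands in for WTF's `Atomic<>`, and the contended path simply busy-waits instead of parking the thread, so `hasParkedBit` is reserved but never set here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a lock "projected" onto bits of an existing word.
// Only isHeldBit is ever modified, so unrelated bits in the same word
// (for example, indexing-type bits sharing a header byte) are preserved.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct ToyLockAlgorithm {
    static_assert(!(isHeldBit & hasParkedBit), "the two lock bits must be distinct");

    // Try once to set isHeldBit; fail immediately if it is already set.
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        for (;;) {
            if (old & isHeldBit)
                return false;
            LockType desired = static_cast<LockType>(old | isHeldBit);
            // On failure, compare_exchange_weak reloads `old` and we retry.
            if (word.compare_exchange_weak(old, desired,
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    // Assumption: a plain busy-wait replaces the outlined, parking slow path.
    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word)) { }
    }

    // Clear only isHeldBit, leaving every other bit as it was.
    static void unlock(std::atomic<LockType>& word)
    {
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};

// The parameters the commit message gives for WTF::Lock: uint8_t, 1, 2.
using ToyWTFLock = ToyLockAlgorithm<std::uint8_t, 1, 2>;
```

Passing the word by reference, rather than wrapping it in a lock class, is what lets the same algorithm run over a field that also carries other data; a caller would pick whichever two bits are free in that field.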
/*
Use constexpr instead of const in symbol definitions that are obviously constexpr.
https://bugs.webkit.org/show_bug.cgi?id=201879

Rubber-stamped by Joseph Pecoraro.

Source/bmalloc:

* bmalloc/AvailableMemory.cpp: * bmalloc/IsoTLS.h: * bmalloc/Map.h: * bmalloc/Mutex.cpp: (bmalloc::Mutex::lockSlowCase): * bmalloc/PerThread.h: * bmalloc/Vector.h: * bmalloc/Zone.h:

Source/JavaScriptCore:

const may require external storage (at the compiler's whim) though these currently do not. constexpr makes it clear that the value is a literal constant that can be inlined. In most cases in the code, when we say static const, we actually mean static constexpr. I'm changing the code to reflect this.

* API/JSAPIValueWrapper.h: * API/JSCallbackConstructor.h: * API/JSCallbackObject.h: * API/JSContextRef.cpp: * API/JSWrapperMap.mm: * API/tests/CompareAndSwapTest.cpp: * API/tests/TypedArrayCTest.cpp: * API/tests/testapi.mm: (testObjectiveCAPIMain): * KeywordLookupGenerator.py: (Trie.printAsC): * assembler/ARMv7Assembler.h: * assembler/AssemblerBuffer.h: * assembler/AssemblerCommon.h: * assembler/MacroAssembler.h: * assembler/MacroAssemblerARM64.h: * assembler/MacroAssemblerARM64E.h: * assembler/MacroAssemblerARMv7.h: * assembler/MacroAssemblerCodeRef.h: * assembler/MacroAssemblerMIPS.h: * assembler/MacroAssemblerX86.h: * assembler/MacroAssemblerX86Common.h: (JSC::MacroAssemblerX86Common::absDouble): (JSC::MacroAssemblerX86Common::negateDouble): * assembler/MacroAssemblerX86_64.h: * assembler/X86Assembler.h: * b3/B3Bank.h: * b3/B3CheckSpecial.h: * b3/B3DuplicateTails.cpp: * b3/B3EliminateCommonSubexpressions.cpp: * b3/B3FixSSA.cpp: * b3/B3FoldPathConstants.cpp: * b3/B3InferSwitches.cpp: * b3/B3Kind.h: * b3/B3LowerToAir.cpp: * b3/B3NativeTraits.h: * b3/B3ReduceDoubleToFloat.cpp: * b3/B3ReduceLoopStrength.cpp: * b3/B3ReduceStrength.cpp: * b3/B3ValueKey.h: * b3/air/AirAllocateRegistersByGraphColoring.cpp: * b3/air/AirAllocateStackByGraphColoring.cpp: * b3/air/AirArg.h: *
b3/air/AirCCallSpecial.h: * b3/air/AirEmitShuffle.cpp: * b3/air/AirFixObviousSpills.cpp: * b3/air/AirFormTable.h: * b3/air/AirLowerAfterRegAlloc.cpp: * b3/air/AirPrintSpecial.h: * b3/air/AirStackAllocation.cpp: * b3/air/AirTmp.h: * b3/testb3_6.cpp: (testInterpreter): * bytecode/AccessCase.cpp: * bytecode/CallLinkStatus.cpp: * bytecode/CallVariant.h: * bytecode/CodeBlock.h: * bytecode/CodeOrigin.h: * bytecode/DFGExitProfile.h: * bytecode/DirectEvalCodeCache.h: * bytecode/ExecutableToCodeBlockEdge.h: * bytecode/GetterSetterAccessCase.cpp: * bytecode/LazyOperandValueProfile.h: * bytecode/ObjectPropertyCondition.h: * bytecode/ObjectPropertyConditionSet.cpp: * bytecode/PolymorphicAccess.cpp: * bytecode/PropertyCondition.h: * bytecode/SpeculatedType.h: * bytecode/StructureStubInfo.cpp: * bytecode/UnlinkedCodeBlock.cpp: (JSC::UnlinkedCodeBlock::typeProfilerExpressionInfoForBytecodeOffset): * bytecode/UnlinkedCodeBlock.h: * bytecode/UnlinkedEvalCodeBlock.h: * bytecode/UnlinkedFunctionCodeBlock.h: * bytecode/UnlinkedFunctionExecutable.h: * bytecode/UnlinkedModuleProgramCodeBlock.h: * bytecode/UnlinkedProgramCodeBlock.h: * bytecode/ValueProfile.h: * bytecode/VirtualRegister.h: * bytecode/Watchpoint.h: * bytecompiler/BytecodeGenerator.h: * bytecompiler/Label.h: * bytecompiler/NodesCodegen.cpp: (JSC::ThisNode::emitBytecode): * bytecompiler/RegisterID.h: * debugger/Breakpoint.h: * debugger/DebuggerParseData.cpp: * debugger/DebuggerPrimitives.h: * debugger/DebuggerScope.h: * dfg/DFGAbstractHeap.h: * dfg/DFGAbstractValue.h: * dfg/DFGArgumentsEliminationPhase.cpp: * dfg/DFGByteCodeParser.cpp: * dfg/DFGCSEPhase.cpp: * dfg/DFGCommon.h: * dfg/DFGCompilationKey.h: * dfg/DFGDesiredGlobalProperty.h: * dfg/DFGEdgeDominates.h: * dfg/DFGEpoch.h: * dfg/DFGForAllKills.h: (JSC::DFG::forAllKilledNodesAtNodeIndex): * dfg/DFGGraph.cpp: (JSC::DFG::Graph::isLiveInBytecode): * dfg/DFGHeapLocation.h: * dfg/DFGInPlaceAbstractState.cpp: * dfg/DFGIntegerCheckCombiningPhase.cpp: * 
dfg/DFGIntegerRangeOptimizationPhase.cpp: * dfg/DFGInvalidationPointInjectionPhase.cpp: * dfg/DFGLICMPhase.cpp: * dfg/DFGLazyNode.h: * dfg/DFGMinifiedID.h: * dfg/DFGMovHintRemovalPhase.cpp: * dfg/DFGNodeFlowProjection.h: * dfg/DFGNodeType.h: * dfg/DFGObjectAllocationSinkingPhase.cpp: * dfg/DFGPhantomInsertionPhase.cpp: * dfg/DFGPromotedHeapLocation.h: * dfg/DFGPropertyTypeKey.h: * dfg/DFGPureValue.h: * dfg/DFGPutStackSinkingPhase.cpp: * dfg/DFGRegisterBank.h: * dfg/DFGSSAConversionPhase.cpp: * dfg/DFGSSALoweringPhase.cpp: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::compileDoubleRep): (JSC::DFG::compileClampDoubleToByte): (JSC::DFG::SpeculativeJIT::compileArithRounding): (JSC::DFG::compileArithPowIntegerFastPath): (JSC::DFG::SpeculativeJIT::compileArithPow): (JSC::DFG::SpeculativeJIT::emitBinarySwitchStringRecurse): * dfg/DFGStackLayoutPhase.cpp: * dfg/DFGStoreBarrierInsertionPhase.cpp: * dfg/DFGStrengthReductionPhase.cpp: * dfg/DFGStructureAbstractValue.h: * dfg/DFGVarargsForwardingPhase.cpp: * dfg/DFGVariableEventStream.cpp: (JSC::DFG::VariableEventStream::reconstruct const): * dfg/DFGWatchpointCollectionPhase.cpp: * disassembler/ARM64/A64DOpcode.h: * ftl/FTLLocation.h: * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compileArithRandom): * ftl/FTLSlowPathCall.cpp: * ftl/FTLSlowPathCallKey.h: * heap/CellContainer.h: * heap/CellState.h: * heap/ConservativeRoots.h: * heap/GCSegmentedArray.h: * heap/HandleBlock.h: * heap/Heap.cpp: (JSC::Heap::updateAllocationLimits): * heap/Heap.h: * heap/HeapSnapshot.h: * heap/HeapUtil.h: (JSC::HeapUtil::findGCObjectPointersForMarking): * heap/IncrementalSweeper.cpp: * heap/LargeAllocation.h: * heap/MarkedBlock.cpp: * heap/Strong.h: * heap/VisitRaceKey.h: * heap/Weak.h: * heap/WeakBlock.h: * inspector/JSInjectedScriptHost.h: * inspector/JSInjectedScriptHostPrototype.h: * inspector/JSJavaScriptCallFrame.h: * inspector/JSJavaScriptCallFramePrototype.h: * inspector/agents/InspectorConsoleAgent.cpp: * 
inspector/agents/InspectorRuntimeAgent.cpp: (Inspector::InspectorRuntimeAgent::getRuntimeTypesForVariablesAtOffsets): * inspector/scripts/codegen/generate_cpp_protocol_types_header.py: (CppProtocolTypesHeaderGenerator._generate_versions): * inspector/scripts/tests/generic/expected/version.json-result: * interpreter/Interpreter.h: * interpreter/ShadowChicken.cpp: * jit/BinarySwitch.cpp: * jit/CallFrameShuffler.h: * jit/ExecutableAllocator.h: * jit/FPRInfo.h: * jit/GPRInfo.h: * jit/ICStats.h: * jit/JITThunks.h: * jit/Reg.h: * jit/RegisterSet.h: * jit/TempRegisterSet.h: * jsc.cpp: * parser/ASTBuilder.h: * parser/Nodes.h: * parser/SourceCodeKey.h: * parser/SyntaxChecker.h: * parser/VariableEnvironment.h: * profiler/ProfilerOrigin.h: * profiler/ProfilerOriginStack.h: * profiler/ProfilerUID.h: * runtime/AbstractModuleRecord.cpp: * runtime/ArrayBufferNeuteringWatchpointSet.h: * runtime/ArrayConstructor.h: * runtime/ArrayConventions.h: * runtime/ArrayIteratorPrototype.h: * runtime/ArrayPrototype.cpp: (JSC::setLength): * runtime/AsyncFromSyncIteratorPrototype.h: * runtime/AsyncGeneratorFunctionPrototype.h: * runtime/AsyncGeneratorPrototype.h: * runtime/AsyncIteratorPrototype.h: * runtime/AtomicsObject.cpp: * runtime/BigIntConstructor.h: * runtime/BigIntPrototype.h: * runtime/BooleanPrototype.h: * runtime/ClonedArguments.h: * runtime/CodeCache.h: * runtime/ControlFlowProfiler.h: * runtime/CustomGetterSetter.h: * runtime/DateConstructor.h: * runtime/DatePrototype.h: * runtime/DefinePropertyAttributes.h: * runtime/ErrorPrototype.h: * runtime/EvalExecutable.h: * runtime/Exception.h: * runtime/ExceptionHelpers.cpp: (JSC::invalidParameterInSourceAppender): (JSC::invalidParameterInstanceofSourceAppender): * runtime/ExceptionHelpers.h: * runtime/ExecutableBase.h: * runtime/FunctionExecutable.h: * runtime/FunctionRareData.h: * runtime/GeneratorPrototype.h: * runtime/GenericArguments.h: * runtime/GenericOffset.h: * runtime/GetPutInfo.h: * runtime/GetterSetter.h: * 
runtime/GlobalExecutable.h: * runtime/Identifier.h: * runtime/InspectorInstrumentationObject.h: * runtime/InternalFunction.h: * runtime/IntlCollatorConstructor.h: * runtime/IntlCollatorPrototype.h: * runtime/IntlDateTimeFormatConstructor.h: * runtime/IntlDateTimeFormatPrototype.h: * runtime/IntlNumberFormatConstructor.h: * runtime/IntlNumberFormatPrototype.h: * runtime/IntlObject.h: * runtime/IntlPluralRulesConstructor.h: * runtime/IntlPluralRulesPrototype.h: * runtime/IteratorPrototype.h: * runtime/JSArray.cpp: (JSC::JSArray::tryCreateUninitializedRestricted): * runtime/JSArray.h: * runtime/JSArrayBuffer.h: * runtime/JSArrayBufferView.h: * runtime/JSBigInt.h: * runtime/JSCJSValue.h: * runtime/JSCell.h: * runtime/JSCustomGetterSetterFunction.h: * runtime/JSDataView.h: * runtime/JSDataViewPrototype.h: * runtime/JSDestructibleObject.h: * runtime/JSFixedArray.h: * runtime/JSGenericTypedArrayView.h: * runtime/JSGlobalLexicalEnvironment.h: * runtime/JSGlobalObject.h: * runtime/JSImmutableButterfly.h: * runtime/JSInternalPromiseConstructor.h: * runtime/JSInternalPromiseDeferred.h: * runtime/JSInternalPromisePrototype.h: * runtime/JSLexicalEnvironment.h: * runtime/JSModuleEnvironment.h: * runtime/JSModuleLoader.h: * runtime/JSModuleNamespaceObject.h: * runtime/JSNonDestructibleProxy.h: * runtime/JSONObject.cpp: * runtime/JSONObject.h: * runtime/JSObject.h: * runtime/JSPromiseConstructor.h: * runtime/JSPromiseDeferred.h: * runtime/JSPromisePrototype.h: * runtime/JSPropertyNameEnumerator.h: * runtime/JSProxy.h: * runtime/JSScope.h: * runtime/JSScriptFetchParameters.h: * runtime/JSScriptFetcher.h: * runtime/JSSegmentedVariableObject.h: * runtime/JSSourceCode.h: * runtime/JSString.cpp: * runtime/JSString.h: * runtime/JSSymbolTableObject.h: * runtime/JSTemplateObjectDescriptor.h: * runtime/JSTypeInfo.h: * runtime/MapPrototype.h: * runtime/MinimumReservedZoneSize.h: * runtime/ModuleProgramExecutable.h: * runtime/NativeExecutable.h: * runtime/NativeFunction.h: * 
runtime/NativeStdFunctionCell.h: * runtime/NumberConstructor.h: * runtime/NumberPrototype.h: * runtime/ObjectConstructor.h: * runtime/ObjectPrototype.h: * runtime/ProgramExecutable.h: * runtime/PromiseDeferredTimer.cpp: * runtime/PropertyMapHashTable.h: * runtime/PropertyNameArray.h: (JSC::PropertyNameArray::add): * runtime/PrototypeKey.h: * runtime/ProxyConstructor.h: * runtime/ProxyObject.cpp: (JSC::ProxyObject::performGetOwnPropertyNames): * runtime/ProxyRevoke.h: * runtime/ReflectObject.h: * runtime/RegExp.h: * runtime/RegExpCache.h: * runtime/RegExpConstructor.h: * runtime/RegExpKey.h: * runtime/RegExpObject.h: * runtime/RegExpPrototype.h: * runtime/RegExpStringIteratorPrototype.h: * runtime/SamplingProfiler.cpp: * runtime/ScopedArgumentsTable.h: * runtime/ScriptExecutable.h: * runtime/SetPrototype.h: * runtime/SmallStrings.h: * runtime/SparseArrayValueMap.h: * runtime/StringConstructor.h: * runtime/StringIteratorPrototype.h: * runtime/StringObject.h: * runtime/StringPrototype.h: * runtime/Structure.h: * runtime/StructureChain.h: * runtime/StructureRareData.h: * runtime/StructureTransitionTable.h: * runtime/Symbol.h: * runtime/SymbolConstructor.h: * runtime/SymbolPrototype.h: * runtime/SymbolTable.h: * runtime/TemplateObjectDescriptor.h: * runtime/TypeProfiler.cpp: * runtime/TypeProfiler.h: * runtime/TypeProfilerLog.cpp: * runtime/VarOffset.h: * testRegExp.cpp: * tools/HeapVerifier.cpp: (JSC::HeapVerifier::checkIfRecorded): * tools/JSDollarVM.cpp: * wasm/WasmB3IRGenerator.cpp: * wasm/WasmBBQPlan.cpp: * wasm/WasmFaultSignalHandler.cpp: * wasm/WasmFunctionParser.h: * wasm/WasmOMGForOSREntryPlan.cpp: * wasm/WasmOMGPlan.cpp: * wasm/WasmPlan.cpp: * wasm/WasmSignature.cpp: * wasm/WasmSignature.h: * wasm/WasmWorklist.cpp: * wasm/js/JSWebAssembly.h: * wasm/js/JSWebAssemblyCodeBlock.h: * wasm/js/WebAssemblyCompileErrorConstructor.h: * wasm/js/WebAssemblyCompileErrorPrototype.h: * wasm/js/WebAssemblyFunction.h: * wasm/js/WebAssemblyInstanceConstructor.h: * 
wasm/js/WebAssemblyInstancePrototype.h: * wasm/js/WebAssemblyLinkErrorConstructor.h: * wasm/js/WebAssemblyLinkErrorPrototype.h: * wasm/js/WebAssemblyMemoryConstructor.h: * wasm/js/WebAssemblyMemoryPrototype.h: * wasm/js/WebAssemblyModuleConstructor.h: * wasm/js/WebAssemblyModulePrototype.h: * wasm/js/WebAssemblyRuntimeErrorConstructor.h: * wasm/js/WebAssemblyRuntimeErrorPrototype.h: * wasm/js/WebAssemblyTableConstructor.h: * wasm/js/WebAssemblyTablePrototype.h: * wasm/js/WebAssemblyToJSCallee.h: * yarr/Yarr.h: * yarr/YarrParser.h: * yarr/generateYarrCanonicalizeUnicode: Source/WebCore: No new tests. Covered by existing tests. * bindings/js/JSDOMConstructorBase.h: * bindings/js/JSDOMWindowProperties.h: * bindings/scripts/CodeGeneratorJS.pm: (GenerateHeader): (GeneratePrototypeDeclaration): * bindings/scripts/test/JS/JSTestActiveDOMObject.h: * bindings/scripts/test/JS/JSTestEnabledBySetting.h: * bindings/scripts/test/JS/JSTestEnabledForContext.h: * bindings/scripts/test/JS/JSTestEventTarget.h: * bindings/scripts/test/JS/JSTestGlobalObject.h: * bindings/scripts/test/JS/JSTestIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedAndIndexedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedDeleterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedGetterCallWith.h: * bindings/scripts/test/JS/JSTestNamedGetterNoIdentifier.h: * bindings/scripts/test/JS/JSTestNamedGetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterNoIdentifier.h: * 
bindings/scripts/test/JS/JSTestNamedSetterThrowingException.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIdentifier.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithIndexedGetterAndSetter.h: * bindings/scripts/test/JS/JSTestNamedSetterWithOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgableProperties.h: * bindings/scripts/test/JS/JSTestNamedSetterWithUnforgablePropertiesAndOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestObj.h: * bindings/scripts/test/JS/JSTestOverrideBuiltins.h: * bindings/scripts/test/JS/JSTestPluginInterface.h: * bindings/scripts/test/JS/JSTestTypedefs.h: * bridge/objc/objc_runtime.h: * bridge/runtime_array.h: * bridge/runtime_method.h: * bridge/runtime_object.h: Source/WebKit: * WebProcess/Plugins/Netscape/JSNPObject.h: Source/WTF: * wtf/Assertions.cpp: * wtf/AutomaticThread.cpp: * wtf/BitVector.h: * wtf/Bitmap.h: * wtf/BloomFilter.h: * wtf/Brigand.h: * wtf/CheckedArithmetic.h: * wtf/CrossThreadCopier.h: * wtf/CurrentTime.cpp: * wtf/DataLog.cpp: * wtf/DateMath.cpp: (WTF::daysFrom1970ToYear): * wtf/DeferrableRefCounted.h: * wtf/GetPtr.h: * wtf/HashFunctions.h: * wtf/HashMap.h: * wtf/HashTable.h: * wtf/HashTraits.h: * wtf/JSONValues.cpp: * wtf/JSONValues.h: * wtf/ListHashSet.h: * wtf/Lock.h: * wtf/LockAlgorithm.h: * wtf/LockAlgorithmInlines.h: (WTF::Hooks>::lockSlow): * wtf/Logger.h: * wtf/LoggerHelper.h: (WTF::LoggerHelper::childLogIdentifier const): * wtf/MainThread.cpp: * wtf/MetaAllocatorPtr.h: * wtf/MonotonicTime.h: * wtf/NaturalLoops.h: (WTF::NaturalLoops::NaturalLoops): * wtf/ObjectIdentifier.h: * wtf/RAMSize.cpp: * wtf/Ref.h: * wtf/RefPtr.h: * wtf/RetainPtr.h: * wtf/SchedulePair.h: * wtf/StackShot.h: * wtf/StdLibExtras.h: * wtf/TinyPtrSet.h: * wtf/URL.cpp: * wtf/URLHash.h: * wtf/URLParser.cpp: (WTF::URLParser::defaultPortForProtocol): * wtf/Vector.h: * wtf/VectorTraits.h: * wtf/WallTime.h: * wtf/WeakHashSet.h: * wtf/WordLock.h: * 
wtf/cocoa/CPUTimeCocoa.cpp: * wtf/cocoa/MemoryPressureHandlerCocoa.mm: * wtf/persistence/PersistentDecoder.h: * wtf/persistence/PersistentEncoder.h: * wtf/text/AtomStringHash.h: * wtf/text/CString.h: * wtf/text/StringBuilder.cpp: (WTF::expandedCapacity): * wtf/text/StringHash.h: * wtf/text/StringImpl.h: * wtf/text/StringToIntegerConversion.h: (WTF::toIntegralType): * wtf/text/SymbolRegistry.h: * wtf/text/TextStream.cpp: (WTF::hasFractions): * wtf/text/WTFString.h: * wtf/text/cocoa/TextBreakIteratorInternalICUCocoa.cpp: Canonical link: https://commits.webkit.org/215538@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@250005 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2019-09-18 00:36:19 +00:00
* Copyright (C) 2015-2019 Apple Inc. All rights reserved.
The GC should be optionally concurrent and disabled by default
https://bugs.webkit.org/show_bug.cgi?id=164454

Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again.

It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled.

To reach this milestone, I needed to do a bunch of stuff:

- The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently with the collector's slot visitors.

- The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of.

- All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked).

- JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details.

  The mutator fencing rules are as follows:

  - Store-store fence before and after setting the butterfly.
  - Store-store fence before setting structure if you had changed the shape of the butterfly.
  - Store-store fence after initializing all fields in an allocation.

- A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads).

- The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying:

      structure->materializePropertyMap();
      ...
      structure->propertyTable()->things

  Instead of saying:

      PropertyTable* table = structure->ensurePropertyTable();
      ...
      table->things

  Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race.

- The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways.

This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a small slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark).

By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently with the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs.
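The property-table idiom described above can be illustrated with a small standalone sketch. This is not JavaScriptCore code: ToyStructure, ToyPropertyTable, clearPropertyTable and offsetOfY are hypothetical names, and std::shared_ptr is an assumed stand-in for the GC keeping the table alive while a caller still refers to it. It only shows why fetching the table once and using the local handle survives a concurrent clear, whereas re-reading the field might not.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for JSC's PropertyTable: property name -> offset.
using ToyPropertyTable = std::map<std::string, int>;

// Hypothetical stand-in for JSC::Structure; not the real class.
class ToyStructure {
public:
    // Returns the table, rebuilding it first if it has been dropped.
    // The caller holds the returned handle, so a later clear cannot
    // invalidate the table the caller is reading.
    std::shared_ptr<ToyPropertyTable> ensurePropertyTable()
    {
        std::lock_guard<std::mutex> locker(m_lock);
        if (!m_table) {
            m_table = std::make_shared<ToyPropertyTable>();
            (*m_table)["x"] = 0;
            (*m_table)["y"] = 1;
            ++m_materializations;
        }
        return m_table;
    }

    // Models the collector dropping the table while holding the lock.
    void clearPropertyTable()
    {
        std::lock_guard<std::mutex> locker(m_lock);
        m_table.reset();
    }

    // Number of times the table had to be rebuilt (for illustration only).
    int materializations() const { return m_materializations; }

private:
    std::mutex m_lock;
    std::shared_ptr<ToyPropertyTable> m_table;
    int m_materializations { 0 };
};

// The "ensure, then use the local" idiom from the message above.
inline int offsetOfY(ToyStructure& structure)
{
    std::shared_ptr<ToyPropertyTable> table = structure.ensurePropertyTable();
    structure.clearPropertyTable(); // Simulate the collector running in between.
    return table->at("y");          // Still valid: we use our own handle.
}
```

In the real code the table is a GC-managed object and the returned value is a raw PropertyTable*; the shared pointer here is only a way to make the lifetime explicit in a self-contained example.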
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
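A minimal sketch of the shape described in the Source/WTF notes above, assuming std::atomic in place of WTF's Atomic<> and a yield loop in place of the real outlined slow path (which parks threads and is the reason hasParkedBit exists). ToyLockAlgorithm, CellLock and the bit positions 0x40/0x80 are hypothetical; the point is only that static methods taking the atomic by reference let a lock be projected onto two bits of a field whose other bits hold unrelated data.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical simplification; hasParkedBit is accepted to mirror the
// described template signature but is unused because this sketch never parks.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct ToyLockAlgorithm {
    // Tries to set isHeldBit without disturbing any other bit.
    static bool lockFast(std::atomic<LockType>& lock)
    {
        LockType oldValue = lock.load(std::memory_order_relaxed);
        for (;;) {
            if (oldValue & isHeldBit)
                return false;
            LockType newValue = static_cast<LockType>(oldValue | isHeldBit);
            if (lock.compare_exchange_weak(oldValue, newValue,
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
            // On failure oldValue was refreshed; loop and re-check.
        }
    }

    static bool tryLock(std::atomic<LockType>& lock) { return lockFast(lock); }

    static void lock(std::atomic<LockType>& lock)
    {
        // Placeholder slow path: yield until the fast path succeeds.
        while (!lockFast(lock))
            std::this_thread::yield();
    }

    // Clears isHeldBit and leaves every other bit as it was.
    static void unlock(std::atomic<LockType>& lock)
    {
        lock.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& lock)
    {
        return lock.load(std::memory_order_acquire) & isHeldBit;
    }
};

// A lock living in the top two bits of a byte; the low bits are free
// for other uses (bit positions chosen for illustration only).
using CellLock = ToyLockAlgorithm<uint8_t, 0x40, 0x80>;
```

Usage would look like `CellLock::lock(header); ...; CellLock::unlock(header);` on a `std::atomic<uint8_t> header`, with the non-lock bits of `header` unchanged across the pair.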
2016-11-15 01:49:22 +00:00
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
* PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
* CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
* EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
* PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
* OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
Use pragma once in WTF https://bugs.webkit.org/show_bug.cgi?id=190527 Reviewed by Chris Dumez. Source/WTF: We also need to consistently include wtf headers from within wtf so we can build wtf without symbol redefinition errors from including the copy in Source and the copy in the build directory. * wtf/ASCIICType.h: * wtf/Assertions.cpp: * wtf/Assertions.h: * wtf/Atomics.h: * wtf/AutomaticThread.cpp: * wtf/AutomaticThread.h: * wtf/BackwardsGraph.h: * wtf/Bag.h: * wtf/BagToHashMap.h: * wtf/BitVector.cpp: * wtf/BitVector.h: * wtf/Bitmap.h: * wtf/BloomFilter.h: * wtf/Box.h: * wtf/BubbleSort.h: * wtf/BumpPointerAllocator.h: * wtf/ByteOrder.h: * wtf/CPUTime.cpp: * wtf/CallbackAggregator.h: * wtf/CheckedArithmetic.h: * wtf/CheckedBoolean.h: * wtf/ClockType.cpp: * wtf/ClockType.h: * wtf/CommaPrinter.h: * wtf/CompilationThread.cpp: * wtf/CompilationThread.h: * wtf/Compiler.h: * wtf/ConcurrentPtrHashSet.cpp: * wtf/ConcurrentVector.h: * wtf/Condition.h: * wtf/CountingLock.cpp: * wtf/CrossThreadTaskHandler.cpp: * wtf/CryptographicUtilities.cpp: * wtf/CryptographicUtilities.h: * wtf/CryptographicallyRandomNumber.cpp: * wtf/CryptographicallyRandomNumber.h: * wtf/CurrentTime.cpp: * wtf/DataLog.cpp: * wtf/DataLog.h: * wtf/DateMath.cpp: * wtf/DateMath.h: * wtf/DecimalNumber.cpp: * wtf/DecimalNumber.h: * wtf/Deque.h: * wtf/DisallowCType.h: * wtf/Dominators.h: * wtf/DoublyLinkedList.h: * wtf/FastBitVector.cpp: * wtf/FastMalloc.cpp: * wtf/FastMalloc.h: * wtf/FeatureDefines.h: * wtf/FilePrintStream.cpp: * wtf/FilePrintStream.h: * wtf/FlipBytes.h: * wtf/FunctionDispatcher.cpp: * wtf/FunctionDispatcher.h: * wtf/GetPtr.h: * wtf/Gigacage.cpp: * wtf/GlobalVersion.cpp: * wtf/GraphNodeWorklist.h: * wtf/GregorianDateTime.cpp: * wtf/GregorianDateTime.h: * wtf/HashFunctions.h: * wtf/HashMap.h: * wtf/HashMethod.h: * wtf/HashSet.h: * wtf/HashTable.cpp: * wtf/HashTraits.h: * wtf/Indenter.h: * wtf/IndexSparseSet.h: * wtf/InlineASM.h: * wtf/Insertion.h: * wtf/IteratorAdaptors.h: * 
wtf/IteratorRange.h: * wtf/JSONValues.cpp: * wtf/JSValueMalloc.cpp: * wtf/LEBDecoder.h: * wtf/Language.cpp: * wtf/ListDump.h: * wtf/Lock.cpp: * wtf/Lock.h: * wtf/LockAlgorithm.h: * wtf/LockedPrintStream.cpp: * wtf/Locker.h: * wtf/MD5.cpp: * wtf/MD5.h: * wtf/MainThread.cpp: * wtf/MainThread.h: * wtf/MallocPtr.h: * wtf/MathExtras.h: * wtf/MediaTime.cpp: * wtf/MediaTime.h: * wtf/MemoryPressureHandler.cpp: * wtf/MessageQueue.h: * wtf/MetaAllocator.cpp: * wtf/MetaAllocator.h: * wtf/MetaAllocatorHandle.h: * wtf/MonotonicTime.cpp: * wtf/MonotonicTime.h: * wtf/NakedPtr.h: * wtf/NoLock.h: * wtf/NoTailCalls.h: * wtf/Noncopyable.h: * wtf/NumberOfCores.cpp: * wtf/NumberOfCores.h: * wtf/OSAllocator.h: * wtf/OSAllocatorPosix.cpp: * wtf/OSRandomSource.cpp: * wtf/OSRandomSource.h: * wtf/ObjcRuntimeExtras.h: * wtf/OrderMaker.h: * wtf/PackedIntVector.h: * wtf/PageAllocation.h: * wtf/PageBlock.cpp: * wtf/PageBlock.h: * wtf/PageReservation.h: * wtf/ParallelHelperPool.cpp: * wtf/ParallelHelperPool.h: * wtf/ParallelJobs.h: * wtf/ParallelJobsLibdispatch.h: * wtf/ParallelVectorIterator.h: * wtf/ParkingLot.cpp: * wtf/ParkingLot.h: * wtf/Platform.h: * wtf/PointerComparison.h: * wtf/Poisoned.cpp: * wtf/PrintStream.cpp: * wtf/PrintStream.h: * wtf/ProcessID.h: * wtf/ProcessPrivilege.cpp: * wtf/RAMSize.cpp: * wtf/RAMSize.h: * wtf/RandomDevice.cpp: * wtf/RandomNumber.cpp: * wtf/RandomNumber.h: * wtf/RandomNumberSeed.h: * wtf/RangeSet.h: * wtf/RawPointer.h: * wtf/ReadWriteLock.cpp: * wtf/RedBlackTree.h: * wtf/Ref.h: * wtf/RefCountedArray.h: * wtf/RefCountedLeakCounter.cpp: * wtf/RefCountedLeakCounter.h: * wtf/RefCounter.h: * wtf/RefPtr.h: * wtf/RetainPtr.h: * wtf/RunLoop.cpp: * wtf/RunLoop.h: * wtf/RunLoopTimer.h: * wtf/RunLoopTimerCF.cpp: * wtf/SHA1.cpp: * wtf/SHA1.h: * wtf/SaturatedArithmetic.h: (saturatedSubtraction): * wtf/SchedulePair.h: * wtf/SchedulePairCF.cpp: * wtf/SchedulePairMac.mm: * wtf/ScopedLambda.h: * wtf/Seconds.cpp: * wtf/Seconds.h: * wtf/SegmentedVector.h: * 
wtf/SentinelLinkedList.h: * wtf/SharedTask.h: * wtf/SimpleStats.h: * wtf/SingleRootGraph.h: * wtf/SinglyLinkedList.h: * wtf/SixCharacterHash.cpp: * wtf/SixCharacterHash.h: * wtf/SmallPtrSet.h: * wtf/Spectrum.h: * wtf/StackBounds.cpp: * wtf/StackBounds.h: * wtf/StackStats.cpp: * wtf/StackStats.h: * wtf/StackTrace.cpp: * wtf/StdLibExtras.h: * wtf/StreamBuffer.h: * wtf/StringHashDumpContext.h: * wtf/StringPrintStream.cpp: * wtf/StringPrintStream.h: * wtf/ThreadGroup.cpp: * wtf/ThreadMessage.cpp: * wtf/ThreadSpecific.h: * wtf/Threading.cpp: * wtf/Threading.h: * wtf/ThreadingPrimitives.h: * wtf/ThreadingPthreads.cpp: * wtf/TimeWithDynamicClockType.cpp: * wtf/TimeWithDynamicClockType.h: * wtf/TimingScope.cpp: * wtf/TinyLRUCache.h: * wtf/TinyPtrSet.h: * wtf/TriState.h: * wtf/TypeCasts.h: * wtf/UUID.cpp: * wtf/UnionFind.h: * wtf/VMTags.h: * wtf/ValueCheck.h: * wtf/Vector.h: * wtf/VectorTraits.h: * wtf/WallTime.cpp: * wtf/WallTime.h: * wtf/WeakPtr.h: * wtf/WeakRandom.h: * wtf/WordLock.cpp: * wtf/WordLock.h: * wtf/WorkQueue.cpp: * wtf/WorkQueue.h: * wtf/WorkerPool.cpp: * wtf/cf/LanguageCF.cpp: * wtf/cf/RunLoopCF.cpp: * wtf/cocoa/Entitlements.mm: * wtf/cocoa/MachSendRight.cpp: * wtf/cocoa/MainThreadCocoa.mm: * wtf/cocoa/MemoryFootprintCocoa.cpp: * wtf/cocoa/WorkQueueCocoa.cpp: * wtf/dtoa.cpp: * wtf/dtoa.h: * wtf/ios/WebCoreThread.cpp: * wtf/ios/WebCoreThread.h: * wtf/mac/AppKitCompatibilityDeclarations.h: * wtf/mac/DeprecatedSymbolsUsedBySafari.mm: * wtf/mbmalloc.cpp: * wtf/persistence/PersistentCoders.cpp: * wtf/persistence/PersistentDecoder.cpp: * wtf/persistence/PersistentEncoder.cpp: * wtf/spi/cf/CFBundleSPI.h: * wtf/spi/darwin/CommonCryptoSPI.h: * wtf/text/ASCIIFastPath.h: * wtf/text/ASCIILiteral.cpp: * wtf/text/AtomicString.cpp: * wtf/text/AtomicString.h: * wtf/text/AtomicStringHash.h: * wtf/text/AtomicStringImpl.cpp: * wtf/text/AtomicStringImpl.h: * wtf/text/AtomicStringTable.cpp: * wtf/text/AtomicStringTable.h: * wtf/text/Base64.cpp: * wtf/text/CString.cpp: * 
wtf/text/CString.h: * wtf/text/ConversionMode.h: * wtf/text/ExternalStringImpl.cpp: * wtf/text/IntegerToStringConversion.h: * wtf/text/LChar.h: * wtf/text/LineEnding.cpp: * wtf/text/StringBuffer.h: * wtf/text/StringBuilder.cpp: * wtf/text/StringBuilder.h: * wtf/text/StringBuilderJSON.cpp: * wtf/text/StringCommon.h: * wtf/text/StringConcatenate.h: * wtf/text/StringHash.h: * wtf/text/StringImpl.cpp: * wtf/text/StringImpl.h: * wtf/text/StringOperators.h: * wtf/text/StringView.cpp: * wtf/text/StringView.h: * wtf/text/SymbolImpl.cpp: * wtf/text/SymbolRegistry.cpp: * wtf/text/SymbolRegistry.h: * wtf/text/TextBreakIterator.cpp: * wtf/text/TextBreakIterator.h: * wtf/text/TextBreakIteratorInternalICU.h: * wtf/text/TextPosition.h: * wtf/text/TextStream.cpp: * wtf/text/UniquedStringImpl.h: * wtf/text/WTFString.cpp: * wtf/text/WTFString.h: * wtf/text/cocoa/StringCocoa.mm: * wtf/text/cocoa/StringViewCocoa.mm: * wtf/text/cocoa/TextBreakIteratorInternalICUCocoa.cpp: * wtf/text/icu/UTextProvider.cpp: * wtf/text/icu/UTextProvider.h: * wtf/text/icu/UTextProviderLatin1.cpp: * wtf/text/icu/UTextProviderLatin1.h: * wtf/text/icu/UTextProviderUTF16.cpp: * wtf/text/icu/UTextProviderUTF16.h: * wtf/threads/BinarySemaphore.cpp: * wtf/threads/BinarySemaphore.h: * wtf/threads/Signals.cpp: * wtf/unicode/CharacterNames.h: * wtf/unicode/Collator.h: * wtf/unicode/CollatorDefault.cpp: * wtf/unicode/UTF8.cpp: * wtf/unicode/UTF8.h: Tools: Put WorkQueue in namespace DRT so it does not conflict with WTF::WorkQueue. * DumpRenderTree/TestRunner.cpp: (TestRunner::queueLoadHTMLString): (TestRunner::queueLoadAlternateHTMLString): (TestRunner::queueBackNavigation): (TestRunner::queueForwardNavigation): (TestRunner::queueLoadingScript): (TestRunner::queueNonLoadingScript): (TestRunner::queueReload): * DumpRenderTree/WorkQueue.cpp: (WorkQueue::singleton): Deleted. (WorkQueue::WorkQueue): Deleted. (WorkQueue::queue): Deleted. (WorkQueue::dequeue): Deleted. (WorkQueue::count): Deleted. 
(WorkQueue::clear): Deleted. (WorkQueue::processWork): Deleted. * DumpRenderTree/WorkQueue.h: (WorkQueue::setFrozen): Deleted. * DumpRenderTree/WorkQueueItem.h: * DumpRenderTree/mac/DumpRenderTree.mm: (runTest): * DumpRenderTree/mac/FrameLoadDelegate.mm: (-[FrameLoadDelegate processWork:]): (-[FrameLoadDelegate webView:locationChangeDone:forDataSource:]): * DumpRenderTree/mac/TestRunnerMac.mm: (TestRunner::notifyDone): (TestRunner::forceImmediateCompletion): (TestRunner::queueLoad): * DumpRenderTree/win/DumpRenderTree.cpp: (runTest): * DumpRenderTree/win/FrameLoadDelegate.cpp: (FrameLoadDelegate::processWork): (FrameLoadDelegate::locationChangeDone): * DumpRenderTree/win/TestRunnerWin.cpp: (TestRunner::notifyDone): (TestRunner::forceImmediateCompletion): (TestRunner::queueLoad): Canonical link: https://commits.webkit.org/205473@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@237099 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2018-10-15 14:24:49 +00:00
#pragma once
The GC should be optionally concurrent and disabled by default
https://bugs.webkit.org/show_bug.cgi?id=164454

Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again.

It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled.

To reach this milestone, I needed to do a bunch of stuff:

- The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors.
- The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of.
- All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it.
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked).
- JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details.
  The mutator fencing rules are as follows:
    - Store-store fence before and after setting the butterfly.
    - Store-store fence before setting structure if you had changed the shape of the butterfly.
    - Store-store fence after initializing all fields in an allocation.
- A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads).
- The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e.
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying:

    structure->materializePropertyMap();
    ...
    structure->propertyTable()->things

  Instead of saying:

    PropertyTable* table = structure->ensurePropertyTable();
    ...
    table->things

  Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race.
- The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways.

This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark).

By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs.
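The property-table idiom described in this message can be illustrated with a small model. This is only a sketch under stated assumptions: `Structure`, `PropertyTable`, `clearPropertyTableFromCollector` and `offsetOf` below are hypothetical stand-ins, not the JavaScriptCore classes, and `std::shared_ptr` is used purely to model "the caller's local keeps the table it obtained". The message does not say how the real code keeps the table alive.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in: property name -> offset.
using PropertyTable = std::map<std::string, int>;

// Hypothetical model of a structure whose table can vanish at any time.
class Structure {
public:
    // Rebuilds the table if it has been dropped and hands the caller a
    // reference that stays usable for as long as the caller holds it.
    std::shared_ptr<PropertyTable> ensurePropertyTable()
    {
        std::lock_guard<std::mutex> locker(m_lock);
        if (!m_table)
            m_table = materialize();
        return m_table;
    }

    // Models the collector discarding the table while marking. It holds the
    // structure's lock while doing so, as the message describes.
    void clearPropertyTableFromCollector()
    {
        std::lock_guard<std::mutex> locker(m_lock);
        m_table.reset();
    }

    bool hasPropertyTable()
    {
        std::lock_guard<std::mutex> locker(m_lock);
        return static_cast<bool>(m_table);
    }

private:
    // Assumption: the table can always be recomputed from other state.
    static std::shared_ptr<PropertyTable> materialize()
    {
        auto table = std::make_shared<PropertyTable>();
        (*table)["x"] = 0;
        (*table)["y"] = 1;
        return table;
    }

    std::mutex m_lock;
    std::shared_ptr<PropertyTable> m_table;
};

// The "latter idiom": ask once, then use only the local.
int offsetOf(Structure& structure, const std::string& name)
{
    std::shared_ptr<PropertyTable> table = structure.ensurePropertyTable();
    auto it = table->find(name);
    return it == table->end() ? -1 : it->second;
}
```

The racy variant would call a materialize function and then re-read the member through a second accessor; if the collector clears the member between those two steps, the second read sees nothing. Holding the result of the single `ensurePropertyTable()` call in a local removes that window in this model.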
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
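The design described in the Source/WTF notes above (static methods, templatized on the lock type and two bit values, taking the atomic field as the first argument) can be sketched roughly as follows. This is an illustration, not the contents of LockAlgorithm.h: it uses `std::atomic` rather than WTF's `Atomic<>`, the slow path simply yields instead of parking the thread on a queue, so `hasParkedBit` is reserved but never set, and the `CellLock` bit positions are made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Sketch only. The template parameters mirror the description in the text;
// everything else is an assumption made for illustration.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct LockAlgorithmSketch {
    static_assert(!(isHeldBit & hasParkedBit), "the two lock bits must be distinct");

    // Sets isHeldBit if it is clear. All other bits in the field are preserved.
    static bool tryLock(std::atomic<LockType>& lock)
    {
        LockType oldValue = lock.load(std::memory_order_relaxed);
        for (;;) {
            if (oldValue & isHeldBit)
                return false;
            LockType newValue = static_cast<LockType>(oldValue | isHeldBit);
            if (lock.compare_exchange_weak(oldValue, newValue,
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
            // oldValue was refreshed by the failed exchange; re-check it.
        }
    }

    // Assumption: a yield loop stands in for the real, outlined slow path.
    static void lock(std::atomic<LockType>& lock)
    {
        while (!tryLock(lock))
            std::this_thread::yield();
    }

    // Clears isHeldBit and leaves the rest of the field untouched.
    static void unlock(std::atomic<LockType>& lock)
    {
        lock.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& lock)
    {
        return lock.load(std::memory_order_acquire) & isHeldBit;
    }
};

// Hypothetical projection onto the top two bits of a one-byte header field.
using CellLock = LockAlgorithmSketch<uint8_t, 0x40, 0x80>;
```

Because the methods are static and take the field by reference, the same byte can keep carrying unrelated data in its remaining bits: with a field holding `0x05`, `CellLock::lock(field)` leaves it at `0x45`, and `CellLock::unlock(field)` restores `0x05`.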
2016-11-15 01:49:22 +00:00
#include <wtf/Atomics.h>
#include <wtf/Compiler.h>
namespace WTF {
The concurrent GC should have a timeslicing controller
https://bugs.webkit.org/show_bug.cgi?id=164783

Reviewed by Geoffrey Garen.

Source/JavaScriptCore:

This adds a simple control system for deciding when the collector should let the mutator run and when it should stop the mutator. We definitely have to stop the mutator during certain collector phases, but during marking - which takes the most time - we can go either way. Normally we want to let the mutator run, but if the heap size starts to grow then we have to stop the mutator just to make sure it doesn't get too far ahead of the collector. That could lead to memory exhaustion, so it's better to just stop in that case.

The controller tries to never stop the mutator for longer than short timeslices. It slices on a 2ms period (configurable via Options). The amount of that period that the collector spends with the mutator stopped is determined by the fraction of the collector's concurrent headroom that has been allocated over. The headroom is currently configured at 50% of what was allocated before the collector started.

This moves a bunch of parameters into Options so that it's easier to play with different configurations. I tried these different values for the period:

1ms: 30% worse than 2ms on splay-latency.
2ms: best score on splay-latency: the tick time above the 99.5th percentile is <2ms.
3ms: 40% worse than 2ms on splay-latency.
4ms: 40% worse than 2ms on splay-latency.

I also tried 100% headroom as an alternative to 50% and found it to be worse.

This patch is a 2x improvement on splay-latency with the default parameters and concurrent GC enabled. Prior to this change, the GC didn't have a good bound on its pause times, which would cause these problems. Concurrent GC is now 5.6x better on splay-latency than no concurrent GC.
* heap/Heap.cpp: (JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::markToFixpoint): (JSC::Heap::collectInThread): * runtime/Options.h: Source/WTF: * wtf/LockAlgorithm.h: Added some comments. * wtf/Seconds.h: Added support for modulo. It's necessary for timeslicing. (WTF::Seconds::operator%): (WTF::Seconds::operator%=): Canonical link: https://commits.webkit.org/182464@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208750 268f45cc-cd09-0410-ab3c-d52691b4dbfc
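The timeslicing rule in this message can be written down as a few lines of arithmetic. The sketch below is a guess at the shape of the computation, not the code in Heap.cpp: the 2ms period and 50% headroom come from the message, while the linear mapping from headroom use to stopped time, the choice to put the stopped slice at the start of each period, and the names `TimesliceSketch`, `collectorFraction` and `mutatorShouldBeStopped` are all assumptions. The `fmod` call plays the role that the new modulo operator on Seconds presumably plays in the real code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical controller; plain doubles stand in for WTF's time types.
struct TimesliceSketch {
    double periodMs = 2.0;       // from the message: 2ms period
    double headroomRatio = 0.5;  // from the message: 50% of bytes allocated before the collection

    // Fraction of each period during which the mutator is kept stopped.
    // Assumption: it grows linearly with how much headroom has been used.
    double collectorFraction(double bytesAtCycleStart, double bytesAllocatedSinceStart) const
    {
        double headroom = headroomRatio * bytesAtCycleStart;
        if (headroom <= 0)
            return 1.0; // no headroom at all: keep the mutator stopped
        return std::min(1.0, std::max(0.0, bytesAllocatedSinceStart / headroom));
    }

    // Assumption: the stopped slice comes first in each period.
    bool mutatorShouldBeStopped(double elapsedMs, double fraction) const
    {
        double phase = std::fmod(elapsedMs, periodMs); // position inside the current period
        return phase < fraction * periodMs;
    }
};
```

Under these assumptions, a heap that held 100 bytes when the collection began has 50 bytes of headroom. After the mutator allocates 25 more bytes, half the headroom is gone, so the mutator is stopped for the first 1ms of every 2ms period and runs for the rest.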
2016-11-15 21:15:04 +00:00
// This is the algorithm used by WTF::Lock. You can use it to project one lock onto any atomic
// field. The limit of one lock is due to the use of the field's address as a key to find the lock's
// queue.
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
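The idea of projecting a lock algorithm onto an existing atomic field, as described in the WTF notes above, can be sketched in standard C++. This is a simplified illustration under stated assumptions, not the WTF implementation: `SpinLockAlgorithm` is a hypothetical name, it uses `std::atomic` rather than WTF's `Atomic<>`, and it yields instead of parking threads, so there is no `hasParkedBit` and no slow path.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical, simplified sketch: static methods that treat one bit of an
// existing atomic word as a lock and leave every other bit untouched.
template<typename LockType, LockType isHeldBit>
struct SpinLockAlgorithm {
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        for (;;) {
            if (old & isHeldBit)
                return false; // Somebody else holds the lock.
            // On failure, compare_exchange_weak reloads `old`, so we just retry.
            if (word.compare_exchange_weak(old, static_cast<LockType>(old | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    static void lock(std::atomic<LockType>& word)
    {
        // The real algorithm parks the thread here; this sketch only yields.
        while (!tryLock(word))
            std::this_thread::yield();
    }

    static void unlock(std::atomic<LockType>& word)
    {
        // Clear only the held bit so neighboring bits in the word survive.
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};
```

A caller would instantiate it with the word type and bit of its choice, for example `SpinLockAlgorithm<uint8_t, 1>::lock(word)`, which is the same shape as the `uint8_t` / `isHeldBit = 1` parameters the text gives for WTF::Lock.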
2016-11-15 01:49:22 +00:00
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
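The commit message above describes a parallel iterator as "just an iterator that can do an atomic next() very quickly", with a null result signalling completion. A minimal sketch of that idea follows. `ParallelSource` is a hypothetical name and a plain class; JSC itself abstracts these iterators as `RefPtr<SharedTask<...()>>`, which is not reproduced here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of a parallel iterator over a fixed list of pointers.
template<typename T>
class ParallelSource {
public:
    explicit ParallelSource(std::vector<T*> items)
        : m_items(std::move(items))
    {
    }

    // Safe to call from many threads at once: fetch_add hands each index to
    // exactly one caller. Returns nullptr once the items are exhausted.
    T* next()
    {
        size_t index = m_next.fetch_add(1, std::memory_order_relaxed);
        return index < m_items.size() ? m_items[index] : nullptr;
    }

private:
    std::vector<T*> m_items;
    std::atomic<size_t> m_next { 0 };
};
```

Each marker thread would loop on `next()` until it returns null, so the threads split the items among themselves without a lock and without a separate load-balancing step.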
2017-12-05 17:53:57 +00:00
template<typename LockType>
struct EmptyLockHooks {
static LockType lockHook(LockType value) { return value; }
static LockType unlockHook(LockType value) { return value; }
static LockType parkHook(LockType value) { return value; }
static LockType handoffHook(LockType value) { return value; }
};
template<typename LockType, LockType isHeldBit, LockType hasParkedBit, typename Hooks = EmptyLockHooks<LockType>>
class LockAlgorithm {
2019-09-18 00:36:19 +00:00
static constexpr bool verbose = false;
static constexpr LockType mask = isHeldBit | hasParkedBit;
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen. Source/JavaScriptCore: This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again. It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff: - The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors. - The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of. - All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. 
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
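The mutator fencing rules in this entry can be illustrated with a small standalone sketch. It is not JSC code: `SketchObject`, `SketchButterfly`, `mutatorTransition`, and `collectorSnapshot` are hypothetical names, and standard C++ release/acquire fences stand in for the store-store and load-load fences the entry describes (they are at least as strong). The mutator publishes the butterfly before the structure, and the collector reads the structure before the butterfly, so the collector can observe a matching pair or a newer butterfly with an older structure, but never a newer structure with an older butterfly.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-ins for a butterfly and an object header.
struct SketchButterfly { int length; };

struct SketchObject {
    std::atomic<SketchButterfly*> butterfly{nullptr};
    std::atomic<unsigned> structureID{0};
};

// Mutator side: fence before and after setting the butterfly, so the
// structure store can never become visible ahead of the butterfly store.
void mutatorTransition(SketchObject& object, SketchButterfly* newButterfly, unsigned newStructureID)
{
    // Fields of *newButterfly were initialized by the caller before this fence.
    std::atomic_thread_fence(std::memory_order_release);
    object.butterfly.store(newButterfly, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    object.structureID.store(newStructureID, std::memory_order_relaxed);
}

struct Snapshot {
    unsigned structureID;
    SketchButterfly* butterfly;
};

// Collector side: read the structure first, fence, then read the butterfly.
Snapshot collectorSnapshot(const SketchObject& object)
{
    unsigned structureID = object.structureID.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    SketchButterfly* butterfly = object.butterfly.load(std::memory_order_relaxed);
    // Second fence so the butterfly's contents are safe to read in portable C++.
    std::atomic_thread_fence(std::memory_order_acquire);
    return { structureID, butterfly };
}

// Runs one transition while a "collector" thread polls the object.
// Returns true if the collector never saw the new structure without the new butterfly.
bool runFenceDemo()
{
    static SketchButterfly storage { 4 };
    SketchObject object;
    std::atomic<bool> consistent{true};
    std::thread collector([&] {
        for (int i = 0; i < 100000; ++i) {
            Snapshot snapshot = collectorSnapshot(object);
            if (snapshot.structureID == 1 && (!snapshot.butterfly || snapshot.butterfly->length != 4))
                consistent = false;
        }
    });
    mutatorTransition(object, &storage, 1);
    collector.join();
    return consistent;
}
```

The sketch only covers the single-transition case that fences can handle; the cases where this ordering breaks down are the ones the entry says are covered by the per-cell lock.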
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
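As a rough illustration of projecting a lock algorithm onto two bits of an existing atomic field, here is a minimal sketch. It is an assumption-laden stand-in, not the contents of LockAlgorithm.h: `SketchLockAlgorithm` is a hypothetical name, `std::atomic` replaces WTF's `Atomic<>`, and the slow path simply yields instead of parking the thread, so `hasParkedBit` is reserved but never set. The bit values 0x40 and 0x80 and the payload 0x05 are also made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical sketch: static methods that treat two bits of a caller-owned
// atomic word as a lock and leave every other bit untouched.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct SketchLockAlgorithm {
    static constexpr LockType mask = isHeldBit | hasParkedBit;

    // Fast path that only succeeds when the entire word is zero,
    // i.e. when the lock owns the whole field.
    static bool lockFastAssumingZero(std::atomic<LockType>& word)
    {
        LockType expected = 0;
        return word.compare_exchange_weak(expected, isHeldBit,
            std::memory_order_acquire, std::memory_order_relaxed);
    }

    // Sets isHeldBit if it is clear, preserving all unrelated bits.
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        while (!(old & isHeldBit)) {
            if (word.compare_exchange_weak(old, static_cast<LockType>(old | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
        return false;
    }

    // Simplified slow path: yield and retry (the real algorithm parks instead).
    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    static void unlock(std::atomic<LockType>& word)
    {
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return (word.load(std::memory_order_acquire) & isHeldBit) != 0;
    }
};

// Hypothetical header byte: the two high bits are the lock, the low bits are payload.
using HeaderLock = SketchLockAlgorithm<uint8_t, 0x40, 0x80>;

// Four threads increment a plain counter under the projected lock.
unsigned runSketchLockDemo()
{
    std::atomic<uint8_t> header{0x05};
    unsigned counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < 1000; ++i) {
                HeaderLock::lock(header);
                ++counter;
                HeaderLock::unlock(header);
            }
        });
    }
    for (auto& thread : threads)
        thread.join();
    assert(header.load() == 0x05); // payload bits survive locking and unlocking
    return counter;
}
```

Passing the atomic word as the first argument, rather than wrapping it in a lock class, is what lets the same algorithm serve both a dedicated lock byte and spare bits in another field.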
2016-11-15 01:49:22 +00:00
public:
static bool lockFastAssumingZero(Atomic<LockType>& lock)
{
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-05 17:53:57 +00:00
return lock.compareExchangeWeak(0, Hooks::lockHook(isHeldBit), std::memory_order_acquire);
}
static bool lockFast(Atomic<LockType>& lock)
{
WTF should make it super easy to do ARM concurrency tricks https://bugs.webkit.org/show_bug.cgi?id=169300 Reviewed by Mark Lam. Source/JavaScriptCore: This changes a bunch of GC hot paths to use new concurrency APIs that lead to optimal code on both x86 (fully leverage TSO, transactions become CAS loops) and ARM (use dependency chains for fencing, transactions become LL/SC loops). While inspecting the machine code, I found other opportunities for improvement, like inlining the "am I marked" part of the marking functions. * heap/Heap.cpp: (JSC::Heap::setGCDidJIT): * heap/HeapInlines.h: (JSC::Heap::testAndSetMarked): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): (JSC::LargeAllocation::aboutToMark): (JSC::LargeAllocation::testAndSetMarked): * heap/MarkedBlock.h: (JSC::MarkedBlock::areMarksStaleWithDependency): (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarkedConcurrently): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::testAndSetMarked): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::appendSlow): (JSC::SlotVisitor::appendHiddenSlow): (JSC::SlotVisitor::appendHiddenSlowImpl): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendUnbarriered): Deleted. (JSC::SlotVisitor::appendHidden): Deleted. * heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::appendUnbarriered): (JSC::SlotVisitor::appendHidden): (JSC::SlotVisitor::append): (JSC::SlotVisitor::appendValues): (JSC::SlotVisitor::appendValuesHidden): * runtime/CustomGetterSetter.cpp: * runtime/JSObject.cpp: (JSC::JSObject::visitButterflyImpl): * runtime/JSObject.h: Source/WTF: This adds Atomic<>::loadLink and Atomic<>::storeCond, available only when HAVE(LL_SC). It abstracts loadLink/storeCond behind prepare/attempt. You can write prepare/attempt loops whenever your loop fits into the least common denominator of LL/SC and CAS. This modifies Atomic<>::transaction to use prepare/attempt. 
So, if you write your loop using Atomic<>::transaction, then you get LL/SC for free. Depending on the kind of transaction you are doing, you may not want to perform an LL until you have a chance to just load the current value. Atomic<>::transaction() assumes that you do not care to have any ordering guarantees in that case. If you think that the transaction has a good chance of aborting this way, you want Atomic<>::transaction() to first do a plain load. But if you don't think that such an abort is likely, then you want to go straight to the LL. The API supports this concept via TransactionAbortLikelihood. Additionally, this redoes the depend/consume API to be dead simple. Dependency is unsigned. You get a dependency on a loaded value by just saying dependency(loadedValue). You consume the dependency by using it as a bonus index to some pointer dereference. This is made easy with the consume<T*>(ptr, dependency) helper. In those cases where you want to pass around both a computed value and a dependency, there's DependencyWith<T>. But you won't need it in most cases. The loaded value or any value computed from the loaded value is a fine input to dependency()! This change updates a bunch of hot paths to use the new APIs. Using transaction() gives us optimal LL/SC loops for object marking and lock acquisition. This change also updates a bunch of hot paths to use dependency()/consume(). This is a significant Octane/splay speed-up on ARM. * wtf/Atomics.h: (WTF::hasFence): (WTF::Atomic::prepare): (WTF::Atomic::attempt): (WTF::Atomic::transaction): (WTF::Atomic::transactionRelaxed): (WTF::nullDependency): (WTF::dependency): (WTF::DependencyWith::DependencyWith): (WTF::dependencyWith): (WTF::consume): (WTF::Atomic::tryTransactionRelaxed): Deleted. (WTF::Atomic::tryTransaction): Deleted. (WTF::zeroWithConsumeDependency): Deleted. (WTF::consumeLoad): Deleted. 
* wtf/Bitmap.h: (WTF::WordType>::get): (WTF::WordType>::concurrentTestAndSet): (WTF::WordType>::concurrentTestAndClear): * wtf/LockAlgorithm.h: (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlockSlow): * wtf/Platform.h: Tools: This vastly simplifies the consume API. The new API is thoroughly tested by being used in the GC's guts. I think that unit tests are a pain to maintain, so we shouldn't have them unless we are legitimately worried about coverage. We're not in this case. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Consume.cpp: Removed. Canonical link: https://commits.webkit.org/186402@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@213645 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-03-09 17:40:10 +00:00
        return lock.transaction(
            [&] (LockType& value) -> bool {
                if (value & isHeldBit)
                    return false;
                value |= isHeldBit;
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-05 17:53:57 +00:00
                value = Hooks::lockHook(value);
                return true;
            },
            std::memory_order_acquire);
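The fast path above relies on Atomic<>::transaction, which the commit message describes as becoming a CAS loop on x86 and an LL/SC loop on ARM. As a rough illustration only, the CAS flavor can be sketched in portable C++ with std::atomic. The names transactionSketch and lockFastSketch, the uint8_t lock word, and the value chosen for isHeldBit are assumptions made for this sketch and are not WTF's actual definitions; the Hooks::lockHook call is left out.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins for the names used in the snippet above.
using LockType = uint8_t;
constexpr LockType isHeldBit = 1;

// Sketch of a CAS-based transaction (an assumption, not WTF's implementation):
// load the current value, let func edit a private copy, then try to publish
// the copy with compare-exchange. Returns false if func aborts the transaction.
template<typename Func>
bool transactionSketch(std::atomic<LockType>& word, const Func& func, std::memory_order order)
{
    LockType oldValue = word.load(std::memory_order_relaxed);
    for (;;) {
        LockType newValue = oldValue;
        if (!func(newValue))
            return false; // func declined to commit, e.g. the lock is already held.
        if (word.compare_exchange_weak(oldValue, newValue, order, std::memory_order_relaxed))
            return true;
        // On failure, compare_exchange_weak has refreshed oldValue; retry.
    }
}

// Mirrors the shape of the fast path above, without the hook call.
bool lockFastSketch(std::atomic<LockType>& lock)
{
    return transactionSketch(
        lock,
        [] (LockType& value) -> bool {
            if (value & isHeldBit)
                return false;
            value |= isHeldBit;
            return true;
        },
        std::memory_order_acquire);
}
```

With this sketch, calling lockFastSketch on a zero-initialized word sets the held bit and returns true, and a second call returns false without modifying the word. A real LL/SC lowering would replace the load and compare-exchange pair with a load-linked and store-conditional, which this portable version cannot express.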
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
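The WTF note above describes LockAlgorithm as a struct templatized on the lock type, the isHeldBit value, and the hasParkedBit value, with static methods that take the atomic field as their first argument. A minimal sketch of that shape follows. It is not the WebKit implementation: it assumes `std::atomic` in place of `WTF::Atomic`, the name `SketchLockAlgorithm` is hypothetical, and it yields and retries instead of parking, so `hasParkedBit` is reserved but never set.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the pattern described above: static methods that
// operate on an existing atomic field and touch only the bits named by the
// template parameters. Unlike WTF::LockAlgorithm, this sketch never parks a
// thread; it yields and retries, so hasParkedBit is reserved but unused.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct SketchLockAlgorithm {
    static bool tryLock(std::atomic<LockType>& lock)
    {
        LockType current = lock.load(std::memory_order_relaxed);
        for (;;) {
            if (current & isHeldBit)
                return false;
            // On failure, compare_exchange_weak reloads `current` for the retry.
            if (lock.compare_exchange_weak(current,
                    static_cast<LockType>(current | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    static void lock(std::atomic<LockType>& lock)
    {
        while (!tryLock(lock))
            std::this_thread::yield();
    }

    static void unlock(std::atomic<LockType>& lock)
    {
        // Clear only the held bit; every other bit in the field is preserved.
        lock.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& lock)
    {
        return lock.load(std::memory_order_acquire) & isHeldBit;
    }
};
```

Under these assumptions, a field whose low bits carry unrelated data could be locked with `using HeaderLock = SketchLockAlgorithm<uint8_t, 0x40, 0x80>;` followed by `HeaderLock::lock(field)` and `HeaderLock::unlock(field)`. Because the static methods only ever modify the held bit, the remaining bits of the field keep their values.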
}
static void lock(Atomic<LockType>& lock)
{
if (UNLIKELY(!lockFast(lock)))
lockSlow(lock);
}
static bool tryLock(Atomic<LockType>& lock)
{
for (;;) {
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-05 17:53:57 +00:00
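The note above describes a parallel iterator as an iterator whose next() is a fast atomic operation, abstracted as a shared task that returns a pointer and signals completion with null. A minimal sketch of that idea follows. It is an illustration only: `makeParallelSource` is a hypothetical helper, and `std::shared_ptr<std::function<T*()>>` is assumed here as a stand-in for `RefPtr<SharedTask<...()>>`.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical helper: returns a shared callable that any number of threads
// may invoke. Each call atomically claims the next element of `items` and
// returns a pointer to it, or nullptr once the vector is exhausted.
// The caller must keep `items` alive and unmodified while the source is used.
template<typename T>
std::shared_ptr<std::function<T*()>> makeParallelSource(std::vector<T>& items)
{
    auto next = std::make_shared<std::atomic<std::size_t>>(0);
    return std::make_shared<std::function<T*()>>([&items, next]() -> T* {
        // fetch_add hands each caller a distinct index, so no element is
        // claimed twice even when many threads call concurrently.
        std::size_t index = next->fetch_add(1, std::memory_order_relaxed);
        return index < items.size() ? &items[index] : nullptr;
    });
}
```

Each worker thread would loop with `while (T* item = (*source)()) { ... }` and stop on the first null result. Because claiming an element is a single atomic increment, the source needs no lock of its own.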
LockType currentValue = lock.load(std::memory_order_relaxed);
if (currentValue & isHeldBit)
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
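The factoring described in the Source/WTF notes above can be illustrated with a minimal sketch. This is not the actual WTF::LockAlgorithm: it assumes std::atomic in place of WTF's Atomic<>, leaves out the hasParkedBit and the parking slow path, and simply yields while the lock is held. The names TinyLockAlgorithm and lockAndUnlockHeaderByte, and the choice of bit 0x40, are hypothetical. The only point is that static methods taking the atomic word as their first argument let a lock occupy one bit of an existing field without disturbing the other bits.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical, simplified stand-in for the idea behind LockAlgorithm:
// the lock is "projected" onto one bit of a caller-owned atomic word.
template<typename LockType, LockType isHeldBit>
struct TinyLockAlgorithm {
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType current = word.load(std::memory_order_relaxed);
        for (;;) {
            if (current & isHeldBit)
                return false; // Somebody else holds the lock.
            // On failure, compare_exchange_weak reloads `current`, so we retry.
            if (word.compare_exchange_weak(current, static_cast<LockType>(current | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    static void lock(std::atomic<LockType>& word)
    {
        // Assumption: yield instead of parking the thread as the real lock does.
        while (!tryLock(word))
            std::this_thread::yield();
    }

    static void unlock(std::atomic<LockType>& word)
    {
        // Clear only the held bit; all other bits keep their values.
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};

// Usage: lock bit 0x40 of a header byte whose low bits hold unrelated data.
inline uint8_t lockAndUnlockHeaderByte(uint8_t initial)
{
    using HeaderLock = TinyLockAlgorithm<uint8_t, 0x40>;
    std::atomic<uint8_t> header(initial);
    HeaderLock::lock(header);
    assert(HeaderLock::isLocked(header));
    assert(!HeaderLock::tryLock(header));
    HeaderLock::unlock(header);
    return header.load();
}
```

After lockAndUnlockHeaderByte returns, the byte has its initial value again as long as that value did not already have bit 0x40 set, which is the property that makes it possible to stuff such a lock into spare header bits.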
2016-11-15 01:49:22 +00:00
return false;
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
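The "atomic next()" idea from the JavaScriptCore notes above can be sketched as follows. This is an illustration only: it uses std::atomic and std::function over a plain vector rather than JSC's RefPtr<SharedTask<...()>> machinery, and the names makeParallelSource and sumInParallel are hypothetical. Every thread calls the same source, each call hands out a distinct element, and a null result means the source is exhausted.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: returns a callable that hands out each element of
// `items` exactly once across all callers, then returns nullptr forever.
// The vector must outlive the returned callable and must not be resized.
template<typename T>
std::function<T*()> makeParallelSource(std::vector<T>& items)
{
    auto nextIndex = std::make_shared<std::atomic<std::size_t>>(0);
    return [&items, nextIndex]() -> T* {
        // One atomic increment claims a slot; no lock is needed.
        std::size_t index = nextIndex->fetch_add(1, std::memory_order_relaxed);
        return index < items.size() ? &items[index] : nullptr;
    };
}

// Usage: several threads drain the same source until it reports "done".
inline int sumInParallel(std::vector<int>& items, unsigned threadCount)
{
    auto next = makeParallelSource(items);
    std::atomic<int> total(0);
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < threadCount; ++i) {
        threads.emplace_back([&] {
            while (int* item = next())
                total.fetch_add(*item, std::memory_order_relaxed);
        });
    }
    for (auto& thread : threads)
        thread.join();
    return total.load();
}
```

Because claiming the next element costs a single atomic increment, sources of this shape compose cheaply, which is presumably why the commit builds the marked-cell iterator out of iterators over subspaces, allocators, and blocks.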
2017-12-05 17:53:57 +00:00
if (lock.compareExchangeWeak(currentValue, Hooks::lockHook(currentValue | isHeldBit), std::memory_order_acquire))
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
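The LockAlgorithm design described in the WTF notes above (static methods, templatized on the lock word type and on the bit values, operating on an existing atomic field) can be illustrated with a much smaller sketch. Everything here is an assumption for illustration: ToyLockAlgorithm is a hypothetical name, it uses std::atomic rather than WTF's Atomic<>, and it yields under contention instead of parking, so it has no hasParkedBit and no slow path.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Toy lock algorithm projected onto one bit of an existing atomic word.
// Contended lockers yield instead of parking, so only isHeldBit is needed.
template<typename LockType, LockType isHeldBit>
struct ToyLockAlgorithm {
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        while (!(old & isHeldBit)) {
            // On failure, compare_exchange_weak reloads `old`, and we re-check it.
            if (word.compare_exchange_weak(old, static_cast<LockType>(old | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
        return false;
    }

    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    static void unlock(std::atomic<LockType>& word)
    {
        // Clears only the held bit; all other bits of the word are preserved.
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return (word.load(std::memory_order_acquire) & isHeldBit) != 0;
    }
};

// Hypothetical header byte: low bits carry unrelated data, bit 0x40 is the lock.
using HeaderLock = ToyLockAlgorithm<uint8_t, 0x40>;

// Two threads increment a plain int under the lock; returns the final count.
inline int runHeaderLockDemo()
{
    std::atomic<uint8_t> header { 0x05 };
    int counter = 0;
    auto work = [&] {
        for (int i = 0; i < 10000; ++i) {
            HeaderLock::lock(header);
            ++counter;
            HeaderLock::unlock(header);
        }
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    assert(header.load() == 0x05); // the unrelated bits survived
    return counter;
}
```

Because the methods are static and take the atomic word by reference, the same algorithm can be pointed at any field that has a spare bit, which is the property the notes above are after.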
return true;
}
}
static bool unlockFastAssumingZero(Atomic<LockType>& lock)
{
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-05 17:53:57 +00:00
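The commit message above describes a parallel iterator as an iterator whose next() is a fast atomic operation and which signals completion by returning null. A minimal sketch of that idea follows. It is not the WebKit implementation: makeParallelSource and sumInParallel are hypothetical names, std::function stands in for RefPtr<SharedTask<...()>>, and plain std::thread workers stand in for the GC marker threads.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <thread>
#include <vector>

// Returns a thread-safe next() over `items`. Each call claims one element with
// a single atomic fetch_add and returns nullptr once the vector is exhausted.
// The vector must outlive the returned callable.
template<typename T>
std::function<const T*()> makeParallelSource(const std::vector<T>& items)
{
    auto cursor = std::make_shared<std::atomic<size_t>>(0);
    return [&items, cursor]() -> const T* {
        size_t index = cursor->fetch_add(1, std::memory_order_relaxed);
        return index < items.size() ? &items[index] : nullptr;
    };
}

// Drains one shared source from `threadCount` threads and returns the sum of
// all elements. Each element is handed to exactly one thread.
inline long sumInParallel(const std::vector<int>& items, unsigned threadCount)
{
    auto next = makeParallelSource(items);
    std::atomic<long> total { 0 };
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < threadCount; ++i) {
        threads.emplace_back([&] {
            long local = 0;
            while (const int* item = next())
                local += *item;
            total.fetch_add(local, std::memory_order_relaxed);
        });
    }
    for (auto& thread : threads)
        thread.join();
    return total.load();
}
```

Sources built this way compose naturally: a source over cells can be written in terms of a source over blocks, which is the kind of composition the message describes for subspaces, allocators, blocks, and cells.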
return lock.compareExchangeWeak(isHeldBit, Hooks::unlockHook(0), std::memory_order_release);
}
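The fast paths around this point can be sketched in standalone C++ as follows. This is a rough illustration, not the WTF implementation: `std::atomic` stands in for WTF's `Atomic<>`, the `transaction()` call is written out as an explicit compare-and-swap loop, the `Hooks` calls are left out, and there is no parking slow path. The name `SketchLockAlgorithm` is hypothetical. Strong compare-exchange is used in the "assuming zero" paths (where the original uses a weak one) only so that the sketch never fails spuriously.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the lock algorithm's fast paths only.
// The lock state lives in two bits of an existing atomic word.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct SketchLockAlgorithm {
    static constexpr LockType mask = isHeldBit | hasParkedBit;

    // Succeeds only when the whole word is zero: not held, nobody
    // parked, and no unrelated bits set.
    static bool lockFastAssumingZero(std::atomic<LockType>& lock)
    {
        LockType expected = 0;
        return lock.compare_exchange_strong(
            expected, isHeldBit,
            std::memory_order_acquire, std::memory_order_relaxed);
    }

    // Sets isHeldBit while preserving every other bit in the word.
    static bool lockFast(std::atomic<LockType>& lock)
    {
        LockType value = lock.load(std::memory_order_relaxed);
        for (;;) {
            if (value & isHeldBit)
                return false; // Already held; a real lock would go to a slow path.
            LockType newValue = static_cast<LockType>(value | isHeldBit);
            if (lock.compare_exchange_weak(
                    value, newValue,
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
            // On failure, value was reloaded; retry with the fresh value.
        }
    }

    // Succeeds only when the word is exactly isHeldBit.
    static bool unlockFastAssumingZero(std::atomic<LockType>& lock)
    {
        LockType expected = isHeldBit;
        return lock.compare_exchange_strong(
            expected, static_cast<LockType>(0),
            std::memory_order_release, std::memory_order_relaxed);
    }

    // Clears isHeldBit while preserving unrelated bits. Fails when the
    // lock is not held or when hasParkedBit is set (someone is waiting).
    static bool unlockFast(std::atomic<LockType>& lock)
    {
        LockType value = lock.load(std::memory_order_relaxed);
        for (;;) {
            if ((value & mask) != isHeldBit)
                return false;
            LockType newValue = static_cast<LockType>(value & ~isHeldBit);
            if (lock.compare_exchange_weak(
                    value, newValue,
                    std::memory_order_release, std::memory_order_relaxed))
                return true;
        }
    }
};
```

With `SketchLockAlgorithm<uint8_t, 1, 2>`, a word holding `0x80` cannot be taken by `lockFastAssumingZero` but can be taken by `lockFast`, which leaves `0x81` behind. That difference is why the masked variants matter when the lock shares a byte with other data.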
static bool unlockFast(Atomic<LockType>& lock)
{
WTF should make it super easy to do ARM concurrency tricks https://bugs.webkit.org/show_bug.cgi?id=169300 Reviewed by Mark Lam. Source/JavaScriptCore: This changes a bunch of GC hot paths to use new concurrency APIs that lead to optimal code on both x86 (fully leverage TSO, transactions become CAS loops) and ARM (use dependency chains for fencing, transactions become LL/SC loops). While inspecting the machine code, I found other opportunities for improvement, like inlining the "am I marked" part of the marking functions. * heap/Heap.cpp: (JSC::Heap::setGCDidJIT): * heap/HeapInlines.h: (JSC::Heap::testAndSetMarked): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): (JSC::LargeAllocation::aboutToMark): (JSC::LargeAllocation::testAndSetMarked): * heap/MarkedBlock.h: (JSC::MarkedBlock::areMarksStaleWithDependency): (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarkedConcurrently): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::testAndSetMarked): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::appendSlow): (JSC::SlotVisitor::appendHiddenSlow): (JSC::SlotVisitor::appendHiddenSlowImpl): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendUnbarriered): Deleted. (JSC::SlotVisitor::appendHidden): Deleted. * heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::appendUnbarriered): (JSC::SlotVisitor::appendHidden): (JSC::SlotVisitor::append): (JSC::SlotVisitor::appendValues): (JSC::SlotVisitor::appendValuesHidden): * runtime/CustomGetterSetter.cpp: * runtime/JSObject.cpp: (JSC::JSObject::visitButterflyImpl): * runtime/JSObject.h: Source/WTF: This adds Atomic<>::loadLink and Atomic<>::storeCond, available only when HAVE(LL_SC). It abstracts loadLink/storeCond behind prepare/attempt. You can write prepare/attempt loops whenever your loop fits into the least common denominator of LL/SC and CAS. This modifies Atomic<>::transaction to use prepare/attempt. 
So, if you write your loop using Atomic<>::transaction, then you get LL/SC for free. Depending on the kind of transaction you are doing, you may not want to perform an LL until you have a chance to just load the current value. Atomic<>::transaction() assumes that you do not care to have any ordering guarantees in that case. If you think that the transaction has a good chance of aborting this way, you want Atomic<>::transaction() to first do a plain load. But if you don't think that such an abort is likely, then you want to go straight to the LL. The API supports this concept via TransactionAbortLikelihood. Additionally, this redoes the depend/consume API to be dead simple. Dependency is unsigned. You get a dependency on a loaded value by just saying dependency(loadedValue). You consume the dependency by using it as a bonus index to some pointer dereference. This is made easy with the consume<T*>(ptr, dependency) helper. In those cases where you want to pass around both a computed value and a dependency, there's DependencyWith<T>. But you won't need it in most cases. The loaded value or any value computed from the loaded value is a fine input to dependency()! This change updates a bunch of hot paths to use the new APIs. Using transaction() gives us optimal LL/SC loops for object marking and lock acquisition. This change also updates a bunch of hot paths to use dependency()/consume(). This is a significant Octane/splay speed-up on ARM. * wtf/Atomics.h: (WTF::hasFence): (WTF::Atomic::prepare): (WTF::Atomic::attempt): (WTF::Atomic::transaction): (WTF::Atomic::transactionRelaxed): (WTF::nullDependency): (WTF::dependency): (WTF::DependencyWith::DependencyWith): (WTF::dependencyWith): (WTF::consume): (WTF::Atomic::tryTransactionRelaxed): Deleted. (WTF::Atomic::tryTransaction): Deleted. (WTF::zeroWithConsumeDependency): Deleted. (WTF::consumeLoad): Deleted. 
* wtf/Bitmap.h: (WTF::WordType>::get): (WTF::WordType>::concurrentTestAndSet): (WTF::WordType>::concurrentTestAndClear): * wtf/LockAlgorithm.h: (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlockSlow): * wtf/Platform.h: Tools: This vastly simplifies the consume API. The new API is thoroughly tested by being used in the GC's guts. I think that unit tests are a pain to maintain, so we shouldn't have them unless we are legitimately worried about coverage. We're not in this case. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Consume.cpp: Removed. Canonical link: https://commits.webkit.org/186402@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@213645 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-03-09 17:40:10 +00:00
return lock.transaction(
[&] (LockType& value) -> bool {
if ((value & mask) != isHeldBit)
return false;
value &= ~isHeldBit;
GC constraint solving should be parallel https://bugs.webkit.org/show_bug.cgi?id=179934 Reviewed by JF Bastien. PerformanceTests: Added a version of splay that measures latency in a way that run-jsc-benchmarks groks. * Octane/splay.js: Added. (this.Setup.setup.setup): (this.TearDown.tearDown.tearDown): (Benchmark): (BenchmarkResult): (BenchmarkResult.prototype.valueOf): (BenchmarkSuite): (alert): (Math.random): (BenchmarkSuite.ResetRNG): (RunStep): (BenchmarkSuite.RunSuites): (BenchmarkSuite.CountBenchmarks): (BenchmarkSuite.GeometricMean): (BenchmarkSuite.GeometricMeanTime): (BenchmarkSuite.AverageAbovePercentile): (BenchmarkSuite.GeometricMeanLatency): (BenchmarkSuite.FormatScore): (BenchmarkSuite.prototype.NotifyStep): (BenchmarkSuite.prototype.NotifyResult): (BenchmarkSuite.prototype.NotifyError): (BenchmarkSuite.prototype.RunSingleBenchmark): (RunNextSetup): (RunNextBenchmark): (RunNextTearDown): (BenchmarkSuite.prototype.RunStep): (GeneratePayloadTree): (GenerateKey): (SplayUpdateStats): (InsertNewNode): (SplaySetup): (SplayTearDown): (SplayRun): (SplayTree): (SplayTree.prototype.isEmpty): (SplayTree.prototype.insert): (SplayTree.prototype.remove): (SplayTree.prototype.find): (SplayTree.prototype.findMax): (SplayTree.prototype.findGreatestLessThan): (SplayTree.prototype.exportKeys): (SplayTree.prototype.splay_): (SplayTree.Node): (SplayTree.Node.prototype.traverse_): (report): (start): Source/JavaScriptCore: This makes it possible to do constraint solving in parallel. This looks like a 1% Speedometer speed-up. It's more than 1% on trunk-Speedometer. The constraint solver supports running constraints in parallel in two different ways: - Run multiple constraints in parallel to each other. This only works for constraints that can tolerate other constraints running concurrently to them (constraint.concurrency() == ConstraintConcurrency::Concurrent). This is the most basic kind of parallelism that the constraint solver supports. 
All constraints except the JSC SPI constraints are concurrent. We could probably make them concurrent, but I'm playing it safe for now. - A constraint can create parallel work for itself, which the constraint solver will interleave with other stuff. A constraint can report that it has parallel work by returning ConstraintParallelism::Parallel from its executeImpl() function. Then the solver will allow that constraint's doParallelWorkImpl() function to run on as many GC marker threads as are available, for as long as that function wants to run. It's not possible to have a non-concurrent constraint that creates parallel work. The parallelism is implemented in terms of the existing GC marker threads. This turns out to be most natural for two reasons: - No need to start any other threads. - The constraints all want to be passed a SlotVisitor. Running on the marker threads means having access to those threads' SlotVisitors. Also, it means less load balancing. The solver will create work on each marking thread's SlotVisitor. When the solver is done "stealing" a marker thread, that thread will have work it can start doing immediately. Before this change, we had to contribute the work found by the constraint solver to the global worklist so that it could be distributed to the marker threads by load balancing. This change probably helps to avoid that load balancing step. A lot of this change is about making it easy to iterate GC data structures in parallel. This change makes almost all constraints parallel-enabled, but only the DOM's output constraint uses the parallel work API. That constraint iterates the marked cells in two subspaces. This change makes it very easy to compose parallel iterators over subspaces, allocators, blocks, and cells. The marked cell parallel iterator is composed out of parallel iterators for the others. A parallel iterator is just an iterator that can do an atomic next() very quickly. We abstract them using RefPtr<SharedTask<...()>>, where ... 
is the type returned from the iterator. We know it's done when it returns a falsish version of ... (in the current code, that's always a pointer type, so done is indicated by null). * API/JSMarkingConstraintPrivate.cpp: (JSContextGroupAddMarkingConstraint): * API/JSVirtualMachine.mm: (scanExternalObjectGraph): (scanExternalRememberedSet): * JavaScriptCore.xcodeproj/project.pbxproj: * Sources.txt: * bytecode/AccessCase.cpp: (JSC::AccessCase::propagateTransitions const): * bytecode/CodeBlock.cpp: (JSC::CodeBlock::visitWeakly): (JSC::CodeBlock::shouldJettisonDueToOldAge): (JSC::shouldMarkTransition): (JSC::CodeBlock::propagateTransitions): (JSC::CodeBlock::determineLiveness): * dfg/DFGWorklist.cpp: * ftl/FTLCompile.cpp: (JSC::FTL::compile): * heap/ConstraintParallelism.h: Added. (WTF::printInternal): * heap/Heap.cpp: (JSC::Heap::Heap): (JSC::Heap::addToRememberedSet): (JSC::Heap::runFixpointPhase): (JSC::Heap::stopThePeriphery): (JSC::Heap::resumeThePeriphery): (JSC::Heap::addCoreConstraints): (JSC::Heap::setBonusVisitorTask): (JSC::Heap::runTaskInParallel): (JSC::Heap::forEachSlotVisitor): Deleted. * heap/Heap.h: (JSC::Heap::worldIsRunning const): (JSC::Heap::runFunctionInParallel): * heap/HeapInlines.h: (JSC::Heap::worldIsStopped const): (JSC::Heap::isMarked): (JSC::Heap::incrementDeferralDepth): (JSC::Heap::decrementDeferralDepth): (JSC::Heap::decrementDeferralDepthAndGCIfNeeded): (JSC::Heap::forEachSlotVisitor): (JSC::Heap::collectorBelievesThatTheWorldIsStopped const): Deleted. (JSC::Heap::isMarkedConcurrently): Deleted. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::appendNode): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): Deleted. 
* heap/LockDuringMarking.h: (JSC::lockDuringMarking): * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::parallelNotEmptyBlockSource): * heap/MarkedAllocator.h: * heap/MarkedBlock.h: (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::areMarksStaleWithDependency): Deleted. (JSC::MarkedBlock::isMarkedConcurrently): Deleted. * heap/MarkedSpace.h: (JSC::MarkedSpace::activeWeakSetsBegin): (JSC::MarkedSpace::activeWeakSetsEnd): (JSC::MarkedSpace::newActiveWeakSetsBegin): (JSC::MarkedSpace::newActiveWeakSetsEnd): * heap/MarkingConstraint.cpp: (JSC::MarkingConstraint::MarkingConstraint): (JSC::MarkingConstraint::execute): (JSC::MarkingConstraint::quickWorkEstimate): (JSC::MarkingConstraint::workEstimate): (JSC::MarkingConstraint::doParallelWork): (JSC::MarkingConstraint::finishParallelWork): (JSC::MarkingConstraint::doParallelWorkImpl): (JSC::MarkingConstraint::finishParallelWorkImpl): * heap/MarkingConstraint.h: (JSC::MarkingConstraint::lastExecuteParallelism const): (JSC::MarkingConstraint::parallelism const): (JSC::MarkingConstraint::quickWorkEstimate): Deleted. (JSC::MarkingConstraint::workEstimate): Deleted. * heap/MarkingConstraintSet.cpp: (JSC::MarkingConstraintSet::MarkingConstraintSet): (JSC::MarkingConstraintSet::add): (JSC::MarkingConstraintSet::executeConvergence): (JSC::MarkingConstraintSet::executeConvergenceImpl): (JSC::MarkingConstraintSet::executeAll): (JSC::MarkingConstraintSet::ExecutionContext::ExecutionContext): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didVisitSomething const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::shouldTimeOut const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::drain): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::didExecute const): Deleted. (JSC::MarkingConstraintSet::ExecutionContext::execute): Deleted. (): Deleted. * heap/MarkingConstraintSet.h: * heap/MarkingConstraintSolver.cpp: Added. 
(JSC::MarkingConstraintSolver::MarkingConstraintSolver): (JSC::MarkingConstraintSolver::~MarkingConstraintSolver): (JSC::MarkingConstraintSolver::didVisitSomething const): (JSC::MarkingConstraintSolver::execute): (JSC::MarkingConstraintSolver::drain): (JSC::MarkingConstraintSolver::converge): (JSC::MarkingConstraintSolver::runExecutionThread): (JSC::MarkingConstraintSolver::didExecute): * heap/MarkingConstraintSolver.h: Added. * heap/OpaqueRootSet.h: Removed. * heap/ParallelSourceAdapter.h: Added. (JSC::ParallelSourceAdapter::ParallelSourceAdapter): (JSC::createParallelSourceAdapter): * heap/SimpleMarkingConstraint.cpp: Added. (JSC::SimpleMarkingConstraint::SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::~SimpleMarkingConstraint): (JSC::SimpleMarkingConstraint::quickWorkEstimate): (JSC::SimpleMarkingConstraint::executeImpl): * heap/SimpleMarkingConstraint.h: Added. * heap/SlotVisitor.cpp: (JSC::SlotVisitor::didStartMarking): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::updateMutatorIsStopped): (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate const): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::performIncrementOfDraining): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): (JSC::SlotVisitor::waitForTermination): (JSC::SlotVisitor::addOpaqueRoot): Deleted. (JSC::SlotVisitor::containsOpaqueRoot const): Deleted. (JSC::SlotVisitor::containsOpaqueRootTriState const): Deleted. (JSC::SlotVisitor::mergeIfNecessary): Deleted. (JSC::SlotVisitor::mergeOpaqueRootsIfProfitable): Deleted. (JSC::SlotVisitor::mergeOpaqueRoots): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::addOpaqueRoot): (JSC::SlotVisitor::containsOpaqueRoot const): (JSC::SlotVisitor::vm): (JSC::SlotVisitor::vm const): * heap/Subspace.cpp: (JSC::Subspace::parallelAllocatorSource): (JSC::Subspace::parallelNotEmptyMarkedBlockSource): * heap/Subspace.h: * heap/SubspaceInlines.h: (JSC::Subspace::forEachMarkedCellInParallel): * heap/VisitCounter.h: Added. (JSC::VisitCounter::VisitCounter): (JSC::VisitCounter::visitCount const): * heap/VisitingTimeout.h: Removed. * heap/WeakBlock.cpp: (JSC::WeakBlock::specializedVisit): * runtime/Structure.cpp: (JSC::Structure::isCheapDuringGC): (JSC::Structure::markIfCheap): Source/WebCore: No new tests because no change in behavior. This change is best tested using DOM-GC-intensive benchmarks like Speedometer and Dromaeo. This parallelizes the DOM's output constraint, and makes some small changes to make this more scalable. * ForwardingHeaders/heap/SimpleMarkingConstraint.h: Added. * ForwardingHeaders/heap/VisitingTimeout.h: Removed. * Sources.txt: * WebCore.xcodeproj/project.pbxproj: * bindings/js/DOMGCOutputConstraint.cpp: Added. (WebCore::DOMGCOutputConstraint::DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::~DOMGCOutputConstraint): (WebCore::DOMGCOutputConstraint::executeImpl): (WebCore::DOMGCOutputConstraint::doParallelWorkImpl): (WebCore::DOMGCOutputConstraint::finishParallelWorkImpl): * bindings/js/DOMGCOutputConstraint.h: Added. * bindings/js/WebCoreJSClientData.cpp: (WebCore::JSVMClientData::initNormalWorld): * dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): Source/WTF: This does some changes to make it easier to do parallel constraint solving: - I finally removed dependencyWith. This was a silly construct whose only purpose is to confuse people about what it means to have a dependency chain. 
I took that as an opportunity to greatly simplify the GC's use of dependency chaining. - Added more logic to Deque<>, since I use it for part of the load balancer. - Made it possible to profile lock contention. See https://bugs.webkit.org/show_bug.cgi?id=180250#c0 for some preliminary measurements. - Introduced holdLockIf, which makes it easy to perform predicated lock acquisition. We use that to pick a lock in WebCore. - Introduced CountingLock. It's like WTF::Lock except it also enables optimistic read transactions sorta like Java's StampedLock. * WTF.xcodeproj/project.pbxproj: * wtf/Atomics.h: (WTF::dependency): (WTF::DependencyWith::DependencyWith): Deleted. (WTF::dependencyWith): Deleted. * wtf/BitVector.h: (WTF::BitVector::iterator::operator++): * wtf/CMakeLists.txt: * wtf/ConcurrentPtrHashSet.cpp: Added. (WTF::ConcurrentPtrHashSet::ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::~ConcurrentPtrHashSet): (WTF::ConcurrentPtrHashSet::deleteOldTables): (WTF::ConcurrentPtrHashSet::clear): (WTF::ConcurrentPtrHashSet::initialize): (WTF::ConcurrentPtrHashSet::addSlow): (WTF::ConcurrentPtrHashSet::resizeIfNecessary): (WTF::ConcurrentPtrHashSet::resizeAndAdd): (WTF::ConcurrentPtrHashSet::Table::create): * wtf/ConcurrentPtrHashSet.h: Added. (WTF::ConcurrentPtrHashSet::contains): (WTF::ConcurrentPtrHashSet::add): (WTF::ConcurrentPtrHashSet::size const): (WTF::ConcurrentPtrHashSet::Table::maxLoad const): (WTF::ConcurrentPtrHashSet::hash): (WTF::ConcurrentPtrHashSet::cast): (WTF::ConcurrentPtrHashSet::containsImpl const): (WTF::ConcurrentPtrHashSet::addImpl): * wtf/Deque.h: (WTF::inlineCapacity>::takeFirst): * wtf/FastMalloc.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): * wtf/Locker.h: (WTF::holdLockIf): * wtf/ScopedLambda.h: * wtf/SharedTask.h: (WTF::SharedTask<PassedResultType): (WTF::SharedTask<ResultType): Deleted. * wtf/StackShot.h: Added. 
(WTF::StackShot::StackShot): (WTF::StackShot::operator=): (WTF::StackShot::array const): (WTF::StackShot::size const): (WTF::StackShot::operator bool const): (WTF::StackShot::operator== const): (WTF::StackShot::hash const): (WTF::StackShot::isHashTableDeletedValue const): (WTF::StackShot::operator> const): (WTF::StackShot::deletedValueArray): (WTF::StackShotHash::hash): (WTF::StackShotHash::equal): * wtf/StackShotProfiler.h: Added. (WTF::StackShotProfiler::StackShotProfiler): (WTF::StackShotProfiler::profile): (WTF::StackShotProfiler::run): Tools: * Scripts/run-jsc-benchmarks: Add splay-latency test, since this change needed to be carefully validated with that benchmark. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/ConcurrentPtrHashSet.cpp: Added. This has unit tests of the new concurrent data structure. The tests focus on correctness under serial execution, which appears to be enough for now (it's so easy to catch a concurrency bug by just running the GC). (TestWebKitAPI::TEST): Canonical link: https://commits.webkit.org/196360@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@225524 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-12-05 17:53:57 +00:00
value = Hooks::unlockHook(value);
WTF should make it super easy to do ARM concurrency tricks https://bugs.webkit.org/show_bug.cgi?id=169300 Reviewed by Mark Lam. Source/JavaScriptCore: This changes a bunch of GC hot paths to use new concurrency APIs that lead to optimal code on both x86 (fully leverage TSO, transactions become CAS loops) and ARM (use dependency chains for fencing, transactions become LL/SC loops). While inspecting the machine code, I found other opportunities for improvement, like inlining the "am I marked" part of the marking functions. * heap/Heap.cpp: (JSC::Heap::setGCDidJIT): * heap/HeapInlines.h: (JSC::Heap::testAndSetMarked): * heap/LargeAllocation.h: (JSC::LargeAllocation::isMarked): (JSC::LargeAllocation::isMarkedConcurrently): (JSC::LargeAllocation::aboutToMark): (JSC::LargeAllocation::testAndSetMarked): * heap/MarkedBlock.h: (JSC::MarkedBlock::areMarksStaleWithDependency): (JSC::MarkedBlock::aboutToMark): (JSC::MarkedBlock::isMarkedConcurrently): (JSC::MarkedBlock::isMarked): (JSC::MarkedBlock::testAndSetMarked): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::appendSlow): (JSC::SlotVisitor::appendHiddenSlow): (JSC::SlotVisitor::appendHiddenSlowImpl): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendUnbarriered): Deleted. (JSC::SlotVisitor::appendHidden): Deleted. * heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: (JSC::SlotVisitor::appendUnbarriered): (JSC::SlotVisitor::appendHidden): (JSC::SlotVisitor::append): (JSC::SlotVisitor::appendValues): (JSC::SlotVisitor::appendValuesHidden): * runtime/CustomGetterSetter.cpp: * runtime/JSObject.cpp: (JSC::JSObject::visitButterflyImpl): * runtime/JSObject.h: Source/WTF: This adds Atomic<>::loadLink and Atomic<>::storeCond, available only when HAVE(LL_SC). It abstracts loadLink/storeCond behind prepare/attempt. You can write prepare/attempt loops whenever your loop fits into the least common denominator of LL/SC and CAS. This modifies Atomic<>::transaction to use prepare/attempt. 
So, if you write your loop using Atomic<>::transaction, then you get LL/SC for free. Depending on the kind of transaction you are doing, you may not want to perform an LL until you have a chance to just load the current value. Atomic<>::transaction() assumes that you do not care to have any ordering guarantees in that case. If you think that the transaction has a good chance of aborting this way, you want Atomic<>::transaction() to first do a plain load. But if you don't think that such an abort is likely, then you want to go straight to the LL. The API supports this concept via TransactionAbortLikelihood. Additionally, this redoes the depend/consume API to be dead simple. Dependency is unsigned. You get a dependency on a loaded value by just saying dependency(loadedValue). You consume the dependency by using it as a bonus index to some pointer dereference. This is made easy with the consume<T*>(ptr, dependency) helper. In those cases where you want to pass around both a computed value and a dependency, there's DependencyWith<T>. But you won't need it in most cases. The loaded value or any value computed from the loaded value is a fine input to dependency()! This change updates a bunch of hot paths to use the new APIs. Using transaction() gives us optimal LL/SC loops for object marking and lock acquisition. This change also updates a bunch of hot paths to use dependency()/consume(). This is a significant Octane/splay speed-up on ARM. * wtf/Atomics.h: (WTF::hasFence): (WTF::Atomic::prepare): (WTF::Atomic::attempt): (WTF::Atomic::transaction): (WTF::Atomic::transactionRelaxed): (WTF::nullDependency): (WTF::dependency): (WTF::DependencyWith::DependencyWith): (WTF::dependencyWith): (WTF::consume): (WTF::Atomic::tryTransactionRelaxed): Deleted. (WTF::Atomic::tryTransaction): Deleted. (WTF::zeroWithConsumeDependency): Deleted. (WTF::consumeLoad): Deleted. 
* wtf/Bitmap.h: (WTF::WordType>::get): (WTF::WordType>::concurrentTestAndSet): (WTF::WordType>::concurrentTestAndClear): * wtf/LockAlgorithm.h: (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlockSlow): * wtf/Platform.h: Tools: This vastly simplifies the consume API. The new API is thoroughly tested by being used in the GC's guts. I think that unit tests are a pain to maintain, so we shouldn't have them unless we are legitimately worried about coverage. We're not in this case. * TestWebKitAPI/CMakeLists.txt: * TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj: * TestWebKitAPI/Tests/WTF/Consume.cpp: Removed. Canonical link: https://commits.webkit.org/186402@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@213645 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-03-09 17:40:10 +00:00
return true;
},
std::memory_order_relaxed);
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen. Source/JavaScriptCore: This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again. It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff: - The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors. - The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of. - All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. 
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
}
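// Release the lock. If the fast path fails, take the outlined slow path without fairness.
static void unlock(Atomic<LockType>& lock)
{
    if (UNLIKELY(!unlockFast(lock)))
        unlockSlow(lock, Unfair);
}
// Same as unlock(), except that the slow path is asked to unlock fairly.
static void unlockFairly(Atomic<LockType>& lock)
{
    if (UNLIKELY(!unlockFast(lock)))
        unlockSlow(lock, Fair);
}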
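The commit message above describes LockAlgorithm as a set of static methods that are projected onto an existing Atomic<> field, parameterized by the lock type and the bit values. The sketch below illustrates only that idea. It is not the WTF implementation: it uses std::atomic rather than WTF::Atomic, it spins and yields instead of parking threads (so it has no hasParkedBit), and the names SketchLockAlgorithm, HeaderLock and headerLockPreservesFlags, as well as the choice of bit 0x80, are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical, simplified stand-in for WTF::LockAlgorithm. It spins and
// yields instead of parking, so it only needs the isHeld bit.
template<typename LockType, LockType isHeldBit>
struct SketchLockAlgorithm {
    // Try to set the isHeld bit; all other bits of the word are preserved.
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        while (!(old & isHeldBit)) {
            // On failure, compare_exchange_weak reloads `old`, and we retry.
            if (word.compare_exchange_weak(old, static_cast<LockType>(old | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
        return false; // Somebody else holds the lock.
    }

    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    // Clear only the isHeld bit, leaving the unrelated bits untouched.
    static void unlock(std::atomic<LockType>& word)
    {
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};

// Hypothetical header byte: bit 0x80 is the lock, the low bits are other flags.
using HeaderLock = SketchLockAlgorithm<uint8_t, 0x80>;

// Returns true if locking and unlocking never disturb the unrelated flag bits.
bool headerLockPreservesFlags()
{
    std::atomic<uint8_t> header { 0x05 }; // Pre-existing flag bits.
    HeaderLock::lock(header);
    bool heldAfterLock = HeaderLock::isLocked(header);
    bool secondAttemptFails = !HeaderLock::tryLock(header);
    bool flagsIntactWhileHeld = (header.load() & 0x7f) == 0x05;
    HeaderLock::unlock(header);
    return heldAfterLock && secondAttemptFails && flagsIntactWhileHeld
        && header.load() == 0x05;
}
```

Calling headerLockPreservesFlags() exercises the lock on a byte that also carries other flags, which is the situation the message describes for the JSCell header. A real implementation would park waiting threads rather than spin.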
PerformanceTests: Concurrent GC should be stable enough to land enabled https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Made CDjs more configurable and refined the "large.js" configuration. I was using that one and the new "long.js" configuration to tune concurrent eden GCs. Added a new way of running Splay in browser, which uses chartjs to plot the execution times of 2000 iterations. This includes the minified chartjs. * JetStream/Octane2/splay-detail.html: Added. * JetStream/cdjs/benchmark.js: (benchmarkImpl): (benchmark): * JetStream/cdjs/long.js: Added. Source/JavaScriptCore: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. This fixes a ton of performance and correctness bugs revealed by getting the concurrent GC to be stable enough to land enabled. I had to redo the JSObject::visitChildren concurrency protocol again. This time I think it's even more correct than ever! This is an enormous win on JetStream/splay-latency and Octane/SplayLatency. It looks to be mostly neutral on everything else, though Speedometer is showing statistically weak signs of a slight regression. * API/JSAPIWrapperObject.mm: Added locking. (JSC::JSAPIWrapperObject::visitChildren): * API/JSCallbackObject.h: Added locking. (JSC::JSCallbackObjectData::visitChildren): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::setPrivateProperty): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::deletePrivateProperty): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::visitChildren): * CMakeLists.txt: * JavaScriptCore.xcodeproj/project.pbxproj: * bytecode/CodeBlock.cpp: (JSC::CodeBlock::UnconditionalFinalizer::finalizeUnconditionally): This had a TOCTOU race on shouldJettisonDueToOldAge. (JSC::EvalCodeCache::visitAggregate): Moved to EvalCodeCache.cpp. * bytecode/DirectEvalCodeCache.cpp: Added. Outlined some functions and made them use locks. 
(JSC::DirectEvalCodeCache::setSlow): (JSC::DirectEvalCodeCache::clear): (JSC::DirectEvalCodeCache::visitAggregate): * bytecode/DirectEvalCodeCache.h: (JSC::DirectEvalCodeCache::set): (JSC::DirectEvalCodeCache::clear): Deleted. * bytecode/UnlinkedCodeBlock.cpp: Added locking. (JSC::UnlinkedCodeBlock::visitChildren): (JSC::UnlinkedCodeBlock::setInstructions): (JSC::UnlinkedCodeBlock::shrinkToFit): * bytecode/UnlinkedCodeBlock.h: Added locking. (JSC::UnlinkedCodeBlock::addRegExp): (JSC::UnlinkedCodeBlock::addConstant): (JSC::UnlinkedCodeBlock::addFunctionDecl): (JSC::UnlinkedCodeBlock::addFunctionExpr): (JSC::UnlinkedCodeBlock::createRareDataIfNecessary): (JSC::UnlinkedCodeBlock::shrinkToFit): Deleted. * debugger/Debugger.cpp: Use the right delete API. (JSC::Debugger::recompileAllJSFunctions): * dfg/DFGAbstractInterpreterInlines.h: (JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects): Fix a pre-existing bug in ToFunction constant folding. * dfg/DFGClobberize.h: Add support for nuking. (JSC::DFG::clobberize): * dfg/DFGClobbersExitState.cpp: Add support for nuking. (JSC::DFG::clobbersExitState): * dfg/DFGFixupPhase.cpp: Add support for nuking. (JSC::DFG::FixupPhase::fixupNode): (JSC::DFG::FixupPhase::indexForChecks): (JSC::DFG::FixupPhase::originForCheck): (JSC::DFG::FixupPhase::speculateForBarrier): (JSC::DFG::FixupPhase::insertCheck): (JSC::DFG::FixupPhase::fixupChecksInBlock): * dfg/DFGSpeculativeJIT.cpp: Add support for nuking. (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): * ftl/FTLLowerDFGToB3.cpp: Add support for nuking. (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::nukeStructureAndSetButterfly): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): Deleted. 
* heap/CodeBlockSet.cpp: We need to be more careful about the CodeBlockSet workflow during GC, since we will allocate CodeBlocks in eden while collecting. (JSC::CodeBlockSet::clearMarksForFullCollection): (JSC::CodeBlockSet::deleteUnmarkedAndUnreferenced): * heap/Heap.cpp: Added code to measure max pauses. Added a better collectContinuously mode. (JSC::Heap::lastChanceToFinalize): Stop the collectContinuously thread. (JSC::Heap::harvestWeakReferences): Inline SlotVisitor::harvestWeakReferences. (JSC::Heap::finalizeUnconditionalFinalizers): Inline SlotVisitor::finalizeUnconditionalReferences. (JSC::Heap::markToFixpoint): We need to do some MarkedSpace stuff before every conservative scan, rather than just at the start of marking, so we now call prepareForConservativeScan() before each conservative scan. Also call a less-parallel version of drainInParallel when the mutator is running. (JSC::Heap::collectInThread): Inline Heap::prepareForAllocation(). (JSC::Heap::stopIfNecessarySlow): We need to be more careful about ensuring that we run finalization before and after stopping. Also, we should sanitize stack when stopping the world. (JSC::Heap::acquireAccessSlow): Add some optional debug prints. (JSC::Heap::handleNeedFinalize): Assert that we are running this when the world is not stopped. (JSC::Heap::finalize): Remove the old collectContinuously code. (JSC::Heap::requestCollection): We don't need to sanitize stack here anymore. (JSC::Heap::notifyIsSafeToCollect): Start the collectContinuously thread. It will request collection at 1 kHz. (JSC::Heap::prepareForAllocation): Deleted. (JSC::Heap::preventCollection): Prevent any new concurrent GCs from being initiated. (JSC::Heap::allowCollection): (JSC::Heap::forEachSlotVisitor): Allows us to safely iterate slot visitors. * heap/Heap.h: * heap/HeapInlines.h: (JSC::Heap::writeBarrier): If the 'to' cell is not NewWhite then it could be AnthraciteOrBlack. 
During a full collection, objects may be AnthraciteOrBlack from a previous GC. Turns out, we don't benefit from this optimization so we can just kill it. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::buildSnapshot): This needs to use PreventCollectionScope to ensure snapshot soundness. * heap/ListableHandler.h: (JSC::ListableHandler::isOnList): Useful helper. * heap/LockDuringMarking.h: (JSC::lockDuringMarking): It's a locker that only locks while we're marking. * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::addBlock): Hold the bitvector lock while resizing. * heap/MarkedBlock.cpp: Hold the bitvector lock while accessing the bitvectors while the mutator is running. * heap/MarkedSpace.cpp: (JSC::MarkedSpace::prepareForConservativeScan): We used to do this in prepareForMarking, but we need to do it before each conservative scan not just before marking. (JSC::MarkedSpace::prepareForMarking): Remove the logic moved to prepareForConservativeScan. * heap/MarkedSpace.h: * heap/PreventCollectionScope.h: Added. * heap/SlotVisitor.cpp: Refactored drainFromShared so that we can write a similar function called drainInParallelPassively. (JSC::SlotVisitor::updateMutatorIsStopped): Update whether we can use "fast" scanning. (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drain): This now uses the rightToRun lock to allow the main GC thread to safepoint the workers. (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): This runs marking with one fewer threads than normal. It's useful for when we have resumed the mutator, since then the mutator has a better chance of getting on a core. (JSC::SlotVisitor::addWeakReferenceHarvester): (JSC::SlotVisitor::addUnconditionalFinalizer): (JSC::SlotVisitor::harvestWeakReferences): Deleted. (JSC::SlotVisitor::finalizeUnconditionalFinalizers): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: Outline stuff. (JSC::SlotVisitor::addWeakReferenceHarvester): Deleted. (JSC::SlotVisitor::addUnconditionalFinalizer): Deleted. * runtime/InferredType.cpp: This needed thread safety. (JSC::InferredType::visitChildren): This needs to keep its structure finalizer alive until it runs. (JSC::InferredType::set): (JSC::InferredType::InferredStructureFinalizer::finalizeUnconditionally): * runtime/InferredType.h: * runtime/InferredValue.cpp: This needed thread safety. (JSC::InferredValue::visitChildren): (JSC::InferredValue::ValueCleanup::finalizeUnconditionally): * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): Update to use new butterfly API. (JSC::JSArray::unshiftCountWithArrayStorage): Update to use new butterfly API. * runtime/JSArrayBufferView.cpp: (JSC::JSArrayBufferView::visitChildren): Thread safety. * runtime/JSCell.h: (JSC::JSCell::setStructureIDDirectly): This is used for nuking the structure. (JSC::JSCell::InternalLocker::InternalLocker): Deleted. The cell is now the lock. (JSC::JSCell::InternalLocker::~InternalLocker): Deleted. The cell is now the lock. * runtime/JSCellInlines.h: (JSC::JSCell::structure): Clean this up. (JSC::JSCell::lock): The cell is now the lock. (JSC::JSCell::tryLock): (JSC::JSCell::unlock): (JSC::JSCell::isLocked): (JSC::JSCell::lockInternalLock): Deleted. (JSC::JSCell::unlockInternalLock): Deleted. * runtime/JSFunction.cpp: (JSC::JSFunction::visitChildren): Thread safety. * runtime/JSGenericTypedArrayViewInlines.h: (JSC::JSGenericTypedArrayView<Adaptor>::visitChildren): Thread safety. (JSC::JSGenericTypedArrayView<Adaptor>::slowDownAndWasteMemory): Thread safety. * runtime/JSObject.cpp: (JSC::JSObject::markAuxiliaryAndVisitOutOfLineProperties): Factor out this "easy" step of butterfly visiting. (JSC::JSObject::visitButterfly): Make this achieve 100% precision about structure-butterfly relationships. 
This relies on the mutator "nuking" the structure prior to "locked" structure-butterfly transitions. (JSC::JSObject::visitChildren): Use the new, nicer API. (JSC::JSFinalObject::visitChildren): Use the new, nicer API. (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): Use the new butterfly API. (JSC::JSObject::createInitialUndecided): Use the new butterfly API. (JSC::JSObject::createInitialInt32): Use the new butterfly API. (JSC::JSObject::createInitialDouble): Use the new butterfly API. (JSC::JSObject::createInitialContiguous): Use the new butterfly API. (JSC::JSObject::createArrayStorage): Use the new butterfly API. (JSC::JSObject::convertUndecidedToContiguous): Use the new butterfly API. (JSC::JSObject::convertUndecidedToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertInt32ToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertDoubleToContiguous): Use the new butterfly API. (JSC::JSObject::convertDoubleToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertContiguousToArrayStorage): Use the new butterfly API. (JSC::JSObject::increaseVectorLength): Use the new butterfly API. (JSC::JSObject::shiftButterflyAfterFlattening): Use the new butterfly API. * runtime/JSObject.h: (JSC::JSObject::setButterfly): This now does all of the fences. Only use this when you are not also transitioning the structure or the structure's lastOffset. (JSC::JSObject::nukeStructureAndSetButterfly): Use this when doing locked structure-butterfly transitions. * runtime/JSObjectInlines.h: (JSC::JSObject::putDirectWithoutTransition): Use the newly factored out API. (JSC::JSObject::prepareToPutDirectWithoutTransition): Factor this out! (JSC::JSObject::putDirectInternal): Use the newly factored out API. * runtime/JSPropertyNameEnumerator.cpp: (JSC::JSPropertyNameEnumerator::finishCreation): Locks! (JSC::JSPropertyNameEnumerator::visitChildren): Locks! 
* runtime/JSSegmentedVariableObject.cpp: (JSC::JSSegmentedVariableObject::visitChildren): Locks! * runtime/JSString.cpp: (JSC::JSString::visitChildren): Thread safety. * runtime/ModuleProgramExecutable.cpp: (JSC::ModuleProgramExecutable::visitChildren): Thread safety. * runtime/Options.cpp: For now we disable concurrent GC on not-X86_64. (JSC::recomputeDependentOptions): * runtime/Options.h: Change the default max GC parallelism to 8. I don't know why it was still 7. * runtime/SamplingProfiler.cpp: (JSC::SamplingProfiler::stackTracesAsJSON): This needs to defer GC before grabbing its lock. * runtime/SparseArrayValueMap.cpp: This needed thread safety. (JSC::SparseArrayValueMap::add): (JSC::SparseArrayValueMap::remove): (JSC::SparseArrayValueMap::visitChildren): * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: This had a race between addNewPropertyTransition and visitChildren. (JSC::Structure::Structure): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::add): Help out with nuking support - the m_offset needs to play along. (JSC::Structure::visitChildren): * runtime/Structure.h: Make some useful things public - like the notion of a lastOffset. * runtime/StructureChain.cpp: (JSC::StructureChain::visitChildren): Thread safety! * runtime/StructureChain.h: Thread safety! * runtime/StructureIDTable.cpp: (JSC::StructureIDTable::allocateID): Ensure that we don't get nuked IDs. * runtime/StructureIDTable.h: Add the notion of a nuked ID! It's a bit that the runtime never sees except during specific shady actions like locked structure-butterfly transitions. "Nuking" tells the GC to steer clear and rescan once we fire the barrier. (JSC::nukedStructureIDBit): (JSC::nuke): (JSC::isNuked): (JSC::decontaminate): * runtime/StructureInlines.h: (JSC::Structure::hasIndexingHeader): Better API. (JSC::Structure::add): * runtime/VM.cpp: Better GC interaction. 
(JSC::VM::ensureWatchdog): (JSC::VM::deleteAllLinkedCode): (JSC::VM::deleteAllCode): * runtime/VM.h: (JSC::VM::getStructure): Why wasn't this always an API! * runtime/WebAssemblyExecutable.cpp: (JSC::WebAssemblyExecutable::visitChildren): Thread safety. Source/WebCore: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Made WebCore down with concurrent marking by adding some locking and adapting to some new API. This has new test modes in run-jsc-stress-tests. Also, the way that LayoutTests run is already a fantastic GC test. * ForwardingHeaders/heap/DeleteAllCodeEffort.h: Added. * ForwardingHeaders/heap/LockDuringMarking.h: Added. * bindings/js/GCController.cpp: (WebCore::GCController::deleteAllCode): (WebCore::GCController::deleteAllLinkedCode): * bindings/js/GCController.h: * bindings/js/JSDOMBinding.cpp: (WebCore::getCachedDOMStructure): (WebCore::cacheDOMStructure): * bindings/js/JSDOMGlobalObject.cpp: (WebCore::JSDOMGlobalObject::addBuiltinGlobals): (WebCore::JSDOMGlobalObject::visitChildren): * bindings/js/JSDOMGlobalObject.h: (WebCore::getDOMConstructor): * bindings/js/JSDOMPromise.cpp: (WebCore::DeferredPromise::DeferredPromise): (WebCore::DeferredPromise::clear): * bindings/js/JSXPathResultCustom.cpp: (WebCore::JSXPathResult::visitAdditionalChildren): * dom/EventListenerMap.cpp: (WebCore::EventListenerMap::clear): (WebCore::EventListenerMap::replace): (WebCore::EventListenerMap::add): (WebCore::EventListenerMap::remove): (WebCore::EventListenerMap::find): (WebCore::EventListenerMap::removeFirstEventListenerCreatedFromMarkup): (WebCore::EventListenerMap::copyEventListenersNotCreatedFromMarkupToTarget): (WebCore::EventListenerIterator::EventListenerIterator): * dom/EventListenerMap.h: (WebCore::EventListenerMap::lock): * dom/EventTarget.cpp: (WebCore::EventTarget::visitJSEventListeners): * dom/EventTarget.h: (WebCore::EventTarget::visitJSEventListeners): Deleted. 
* dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): * dom/Node.h: * page/MemoryRelease.cpp: (WebCore::releaseCriticalMemory): * page/cocoa/MemoryReleaseCocoa.mm: (WebCore::jettisonExpensiveObjectsOnTopLevelNavigation): (WebCore::registerMemoryReleaseNotifyCallbacks): Source/WTF: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Adds the ability to say: auto locker = holdLock(any type of lock) Instead of having to say: Locker<LockType> locker(locks of type LockType) I think that we should use "auto locker = holdLock(lock)" as the default way that we acquire locks unless we need to use a special locker type. This also adds the ability to safepoint a lock. Safepointing a lock is basically a super fast way of unlocking it fairly and then immediately relocking it - i.e. letting anyone who is waiting run, without losing steam if there is no one waiting. * wtf/Lock.cpp: (WTF::LockBase::safepointSlow): * wtf/Lock.h: (WTF::LockBase::safepoint): * wtf/LockAlgorithm.h: (WTF::LockAlgorithm::safepointFast): (WTF::LockAlgorithm::safepoint): (WTF::LockAlgorithm::safepointSlow): * wtf/Locker.h: (WTF::AbstractLocker::AbstractLocker): (WTF::Locker::tryLock): (WTF::Locker::operator bool): (WTF::Locker::Locker): (WTF::Locker::operator=): (WTF::holdLock): (WTF::tryHoldLock): Tools: Concurrent GC should be stable enough to land enabled https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Add a new mode that runs GC continuously. Also made eager modes run GC continuously. It's clear that this works just fine in release, but I'm still trying to figure out if it's safe for debug. It might be too slow for debug. 
* Scripts/run-jsc-stress-tests: Canonical link: https://commits.webkit.org/183229@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@209570 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-12-08 22:14:50 +00:00
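// Fast path: returns true when the hasParkedBit is clear, i.e. no thread is parked on this lock.
static bool safepointFast(const Atomic<LockType>& lock)
{
    WTF::compilerFence();
    return !(lock.load(std::memory_order_relaxed) & hasParkedBit);
}
// Takes the outlined slow path only when some thread is parked and waiting for the lock.
static void safepoint(Atomic<LockType>& lock)
{
    if (UNLIKELY(!safepointFast(lock)))
        safepointSlow(lock);
}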
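The commit message for this change proposes "auto locker = holdLock(lock)" as the default way to acquire a lock, so that the lock type is deduced instead of being spelled out as in Locker<LockType>. A minimal sketch of that idiom follows, assuming the standard library's std::mutex and std::unique_lock in place of WTF::Lock and WTF::Locker. The names holdLockSketch, tryHoldLockSketch and Counter are hypothetical and are not part of WTF.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical analogue of WTF::holdLock: the lock type is deduced from the
// argument, so callers can write `auto locker = holdLockSketch(lock);`.
template<typename LockType>
std::unique_lock<LockType> holdLockSketch(LockType& lock)
{
    return std::unique_lock<LockType>(lock); // Blocks until the lock is acquired.
}

// Hypothetical analogue of WTF::tryHoldLock: the returned locker may or may
// not own the lock, so callers must check owns_lock() before relying on it.
template<typename LockType>
std::unique_lock<LockType> tryHoldLockSketch(LockType& lock)
{
    return std::unique_lock<LockType>(lock, std::try_to_lock);
}

struct Counter {
    std::mutex lock;
    int value { 0 };

    // The lock is held for the whole scope and released when locker is destroyed.
    int increment()
    {
        auto locker = holdLockSketch(lock);
        return ++value;
    }

    // Returns false without touching value if the lock could not be taken.
    bool tryIncrement()
    {
        auto locker = tryHoldLockSketch(lock);
        if (!locker.owns_lock())
            return false;
        ++value;
        return true;
    }
};
```

The helper only saves the caller from naming the lock type. The scoped acquire and release behavior is the same as constructing the locker directly.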
The GC should be optionally concurrent and disabled by default https://bugs.webkit.org/show_bug.cgi?id=164454 Reviewed by Geoffrey Garen. Source/JavaScriptCore: This started out as a patch to have the GC scan the stack at the end, and then the outage happened and I decided to pick a more aggressive target: give the GC a concurrent mode that can be enabled at runtime, and whose only effect is that it turns on the ResumeTheWorldScope. This gives our GC a really intuitive workflow: by default, the GC thread is running solo with the world stopped and the parallel markers converged and waiting. We have a parallel work scope to enable the parallel markers and now we have a ResumeTheWorldScope that will optionally resume the world and then stop it again. It's easy to make a concurrent GC that always instantly crashes. I can't promise that this one won't do that when you run it. I set a specific goal: I wanted to do >10 concurrent GCs in debug mode with generations, optimizing JITs, and parallel marking disabled. To reach this milestone, I needed to do a bunch of stuff: - The mutator needs a separate mark stack for the barrier, since it will mutate this stack concurrently to the collector's slot visitors. - The use of CellState to indicate whether an object is being scanned the first time or a subsequent time was racy. It fails spectacularly when a barrier is fired at the same time as visitChildren is running or if the barrier runs at the same time as the GC marks the same object. So, I split SlotVisitor's mark stacks. It's now the case that you know why you're being scanned by looking at which stack you came off of. - All of root marking must be in the collector fixpoint. I renamed markRoots to markToFixpoint. They say concurrency is hard, but the collector looks more intuitive this way. We never gained anything from forcing people to make a choice between scanning something in the fixpoint versus outside of it. 
Because root scanning is cheap, we can afford to do it repeatedly, which means all root scanning can now do constraint-based marking (like: I'll mark you if that thing is marked). - JSObject::visitChildren's scanning of the butterfly raced with property additions, indexed storage transitions and resizing, and a bunch of miscellaneous dirty butterfly reshaping functions - like the one that flattens a dictionary and some sneaky ArrayStorage transformations. Many of these can be fixed by using store-store fences in the mutator and load-load fences in the collector. I've adopted the rule that the collector must always see either a butterfly and structure that match or a newer butterfly with an older structure, where their age is just one transition apart. This can be achieved with fences. For the cases where it breaks down, I added a lock to every JSCell. This is a full-fledged WTF lock that we sneak into two available bits in the indexingType. See the WTF ChangeLog for details. The mutator fencing rules are as follows: - Store-store fence before and after setting the butterfly. - Store-store fence before setting structure if you had changed the shape of the butterfly. - Store-store fence after initializing all fields in an allocation. - A dictionary Structure can change in strange ways while the GC is trying to scan it. So, JSObject::visitChildren will now grab the object's structure's lock if the object's structure is a dictionary. Dictionary structures are 1:1 with their object, so this does not reduce GC parallelism (super unlikely that the GC will simultaneously scan an object from two threads). - The GC can blow away a Structure's property table at any time. As a small consolation, it's now holding the Structure's lock when it does so. But there was tons of code in Structure that uses DeferGC to prevent the GC from blowing away the property table. This doesn't work with concurrent GC, since DeferGC only means that the GC won't run its safepoint (i.e. 
stop-the-world code) in the DeferGC region. It will still do marking and it was the Structure::visitChildren that would delete the table. It turns out that Structure's reliance on the property table not being deleted was the product of code rot. We already had functions that would materialize the table on demand. We were simply making the mistake of saying: structure->materializePropertyMap(); ... structure->propertyTable()->things Instead of saying: PropertyTable* table = structure->ensurePropertyTable(); ... table->things Switching the code to use the latter idiom allowed me to simplify the code a lot while fixing the race. - The LLInt's get_by_val handling was broken because the indexing shape constants were wrong. Once I started putting more things into the IndexingType, that started causing crashes for me. So I fixed LLInt. That turned out to be a lot of work, since that code had rotted in subtle ways. This is a speed-up in SunSpider, probably because of the LLInt fix. This is neutral on Octane and Kraken. It's a smaller slow-down on LongSpider, but I think we can ignore that (we don't view LongSpider as an official benchmark). By default, the concurrent GC is disabled: in all of the places where it would have resumed the world to run marking concurrently to the mutator, it will just skip the resume step. When you enable concurrent GC (--useConcurrentGC=true), it can sometimes run Octane/splay to completion. It seems to perform quite well: on my machine, it improves both splay-throughput and splay-latency. It's probably unstable for other programs. 
* API/JSVirtualMachine.mm: (-[JSVirtualMachine isOldExternalObject:]): * assembler/MacroAssemblerARMv7.h: (JSC::MacroAssemblerARMv7::storeFence): * bytecode/InlineAccess.cpp: (JSC::InlineAccess::dumpCacheSizesAndCrash): (JSC::InlineAccess::generateSelfPropertyAccess): (JSC::InlineAccess::generateArrayLength): * bytecode/ObjectAllocationProfile.h: (JSC::ObjectAllocationProfile::offsetOfInlineCapacity): (JSC::ObjectAllocationProfile::ObjectAllocationProfile): (JSC::ObjectAllocationProfile::initialize): (JSC::ObjectAllocationProfile::inlineCapacity): (JSC::ObjectAllocationProfile::clear): * bytecode/PolymorphicAccess.cpp: (JSC::AccessCase::generateWithGuard): (JSC::AccessCase::generateImpl): * dfg/DFGArrayifySlowPathGenerator.h: * dfg/DFGClobberize.h: (JSC::DFG::clobberize): * dfg/DFGOSRExitCompiler32_64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOSRExitCompiler64.cpp: (JSC::DFG::OSRExitCompiler::compileExit): * dfg/DFGOperations.cpp: * dfg/DFGPlan.cpp: (JSC::DFG::Plan::markCodeBlocks): (JSC::DFG::Plan::rememberCodeBlocks): * dfg/DFGPlan.h: * dfg/DFGSpeculativeJIT.cpp: (JSC::DFG::SpeculativeJIT::emitAllocateRawObject): (JSC::DFG::SpeculativeJIT::checkArray): (JSC::DFG::SpeculativeJIT::arrayify): (JSC::DFG::SpeculativeJIT::compileMakeRope): (JSC::DFG::SpeculativeJIT::compileNewFunctionCommon): (JSC::DFG::SpeculativeJIT::compileCreateActivation): (JSC::DFG::SpeculativeJIT::compileCreateDirectArguments): (JSC::DFG::SpeculativeJIT::compileSpread): (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileNewStringObject): (JSC::DFG::SpeculativeJIT::compileNewTypedArray): (JSC::DFG::SpeculativeJIT::compileStoreBarrier): * dfg/DFGSpeculativeJIT64.cpp: (JSC::DFG::SpeculativeJIT::compile): (JSC::DFG::SpeculativeJIT::compileAllocateNewArrayWithSize): * dfg/DFGTierUpCheckInjectionPhase.cpp: (JSC::DFG::TierUpCheckInjectionPhase::run): * dfg/DFGWorklist.cpp: 
(JSC::DFG::Worklist::markCodeBlocks): (JSC::DFG::Worklist::rememberCodeBlocks): (JSC::DFG::markCodeBlocks): (JSC::DFG::completeAllPlansForVM): (JSC::DFG::rememberCodeBlocks): * dfg/DFGWorklist.h: * ftl/FTLAbstractHeapRepository.cpp: (JSC::FTL::AbstractHeapRepository::AbstractHeapRepository): (JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions): * ftl/FTLAbstractHeapRepository.h: * ftl/FTLJITCode.cpp: (JSC::FTL::JITCode::~JITCode): * ftl/FTLLowerDFGToB3.cpp: (JSC::FTL::DFG::LowerDFGToB3::compilePutStructure): (JSC::FTL::DFG::LowerDFGToB3::compileCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::compileNewFunction): (JSC::FTL::DFG::LowerDFGToB3::compileCreateDirectArguments): (JSC::FTL::DFG::LowerDFGToB3::compileCreateRest): (JSC::FTL::DFG::LowerDFGToB3::compileNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileNewArray): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSpread): (JSC::FTL::DFG::LowerDFGToB3::compileSpread): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayBuffer): (JSC::FTL::DFG::LowerDFGToB3::compileNewArrayWithSize): (JSC::FTL::DFG::LowerDFGToB3::compileNewTypedArray): (JSC::FTL::DFG::LowerDFGToB3::compileMakeRope): (JSC::FTL::DFG::LowerDFGToB3::compileMultiPutByOffset): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeNewObject): (JSC::FTL::DFG::LowerDFGToB3::compileMaterializeCreateActivation): (JSC::FTL::DFG::LowerDFGToB3::splatWords): (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::allocateObject): (JSC::FTL::DFG::LowerDFGToB3::isArrayType): (JSC::FTL::DFG::LowerDFGToB3::emitStoreBarrier): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): * ftl/FTLOSRExitCompiler.cpp: (JSC::FTL::compileStub): * ftl/FTLOutput.cpp: (JSC::FTL::Output::signExt32ToPtr): (JSC::FTL::Output::fence): * ftl/FTLOutput.h: * heap/CellState.h: * heap/GCSegmentedArray.h: * heap/Heap.cpp: 
(JSC::Heap::ResumeTheWorldScope::ResumeTheWorldScope): (JSC::Heap::ResumeTheWorldScope::~ResumeTheWorldScope): (JSC::Heap::Heap): (JSC::Heap::~Heap): (JSC::Heap::harvestWeakReferences): (JSC::Heap::finalizeUnconditionalFinalizers): (JSC::Heap::completeAllJITPlans): (JSC::Heap::markToFixpoint): (JSC::Heap::gatherStackRoots): (JSC::Heap::beginMarking): (JSC::Heap::visitConservativeRoots): (JSC::Heap::visitCompilerWorklistWeakReferences): (JSC::Heap::updateObjectCounts): (JSC::Heap::endMarking): (JSC::Heap::addToRememberedSet): (JSC::Heap::collectInThread): (JSC::Heap::stopTheWorld): (JSC::Heap::resumeTheWorld): (JSC::Heap::setGCDidJIT): (JSC::Heap::setNeedFinalize): (JSC::Heap::setMutatorWaiting): (JSC::Heap::clearMutatorWaiting): (JSC::Heap::finalize): (JSC::Heap::flushWriteBarrierBuffer): (JSC::Heap::writeBarrierSlowPath): (JSC::Heap::canCollect): (JSC::Heap::reportExtraMemoryVisited): (JSC::Heap::reportExternalMemoryVisited): (JSC::Heap::notifyIsSafeToCollect): (JSC::Heap::markRoots): Deleted. (JSC::Heap::visitExternalRememberedSet): Deleted. (JSC::Heap::visitSmallStrings): Deleted. (JSC::Heap::visitProtectedObjects): Deleted. (JSC::Heap::visitArgumentBuffers): Deleted. (JSC::Heap::visitException): Deleted. (JSC::Heap::visitStrongHandles): Deleted. (JSC::Heap::visitHandleStack): Deleted. (JSC::Heap::visitSamplingProfiler): Deleted. (JSC::Heap::visitTypeProfiler): Deleted. (JSC::Heap::visitShadowChicken): Deleted. (JSC::Heap::traceCodeBlocksAndJITStubRoutines): Deleted. (JSC::Heap::visitWeakHandles): Deleted. (JSC::Heap::flushOldStructureIDTables): Deleted. (JSC::Heap::stopAllocation): Deleted. * heap/Heap.h: (JSC::Heap::collectorSlotVisitor): (JSC::Heap::mutatorMarkStack): (JSC::Heap::mutatorShouldBeFenced): (JSC::Heap::addressOfMutatorShouldBeFenced): (JSC::Heap::slotVisitor): Deleted. (JSC::Heap::notifyIsSafeToCollect): Deleted. (JSC::Heap::barrierShouldBeFenced): Deleted. (JSC::Heap::addressOfBarrierShouldBeFenced): Deleted. 
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
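The Source/WTF note above describes a lock algorithm templatized on the lock type, the isHeldBit value, and the hasParkedBit value, exposed as static methods that take the atomic field as their first argument. The following is a minimal sketch of that shape, not the WebKit implementation: it uses std::atomic instead of WTF's Atomic<>, it yields in a loop instead of parking in ParkingLot (so hasParkedBit is carried in the signature but never set), and the names SketchLockAlgorithm, CellLock, demoOtherBitsSurvive, and the bit values 0x40 and 0x80 are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical, simplified stand-in for the LockAlgorithm described in the
// commit message. Assumption: contended waiters yield instead of parking,
// so hasParkedBit is part of the signature but is never set here.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct SketchLockAlgorithm {
    // Try to set isHeldBit without disturbing any other bit in the field.
    static bool tryLock(std::atomic<LockType>& lock)
    {
        LockType oldValue = lock.load(std::memory_order_relaxed);
        for (;;) {
            if (oldValue & isHeldBit)
                return false;
            // On failure compare_exchange_weak reloads oldValue, so the
            // loop re-checks the held bit against the fresh value.
            if (lock.compare_exchange_weak(oldValue,
                    static_cast<LockType>(oldValue | isHeldBit),
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    // Slow-path substitute: keep yielding until the lock is acquired.
    static void lock(std::atomic<LockType>& lock)
    {
        while (!tryLock(lock))
            std::this_thread::yield();
    }

    // Clear only isHeldBit; every other bit of the field is preserved.
    static void unlock(std::atomic<LockType>& lock)
    {
        lock.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& lock)
    {
        return lock.load(std::memory_order_acquire) & isHeldBit;
    }
};

// Hypothetical projection onto two spare bits of a one-byte header field.
using CellLock = SketchLockAlgorithm<uint8_t, 0x40, 0x80>;

// Locks and unlocks a byte whose low bits carry unrelated data, then returns
// the byte so a caller can check that the unrelated bits were untouched.
inline uint8_t demoOtherBitsSurvive()
{
    std::atomic<uint8_t> header { 0x05 };
    CellLock::lock(header);
    bool held = CellLock::isLocked(header);
    CellLock::unlock(header);
    return held ? header.load() : 0xff;
}
```

Because the methods are static and operate on a caller-supplied atomic, the same algorithm can be pointed at any field that has two spare bits, which is the property the commit message relies on for the JSCell header.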
static bool isLocked(const Atomic<LockType>& lock)
{
    return lock.load(std::memory_order_acquire) & isHeldBit;
}
It should be easy to decide how WebKit yields https://bugs.webkit.org/show_bug.cgi?id=174298 Reviewed by Saam Barati. Source/bmalloc: Use sched_yield() explicitly. * bmalloc/StaticMutex.cpp: (bmalloc::StaticMutex::lockSlowCase): Source/JavaScriptCore: Use the new WTF::Thread::yield() function for yielding instead of the C++ function. * heap/Heap.cpp: (JSC::Heap::resumeThePeriphery): * heap/VisitingTimeout.h: * runtime/JSCell.cpp: (JSC::JSCell::lockSlow): (JSC::JSCell::unlockSlow): * runtime/JSCell.h: * runtime/JSCellInlines.h: (JSC::JSCell::lock): (JSC::JSCell::unlock): * runtime/JSLock.cpp: (JSC::JSLock::grabAllLocks): * runtime/SamplingProfiler.cpp: Source/WebCore: No new tests because the WebCore change is just a change to how we #include things. * inspector/InspectorPageAgent.h: * inspector/TimelineRecordFactory.h: * workers/Worker.h: * workers/WorkerGlobalScopeProxy.h: * workers/WorkerMessagingProxy.h: Source/WebKitLegacy: * Storage/StorageTracker.h: Source/WTF: Created a Thread::yield() abstraction for sched_yield(), and made WTF use it everywhere that it had previously used std::this_thread::yield(). To make it less annoying to experiment with changes to the lock algorithm in the future, this also moves the meat of the algorithm into LockAlgorithmInlines.h. Only two files include that header. Since LockAlgorithm.h no longer includes ParkingLot.h, a bunch of files in WK now need to include timing headers (Seconds, MonotonicTime, etc) manually. * WTF.xcodeproj/project.pbxproj: * benchmarks/ToyLocks.h: * wtf/CMakeLists.txt: * wtf/Lock.cpp: * wtf/LockAlgorithm.h: (WTF::LockAlgorithm::lockSlow): Deleted. (WTF::LockAlgorithm::unlockSlow): Deleted. * wtf/LockAlgorithmInlines.h: Added. 
(WTF::hasParkedBit>::lockSlow): (WTF::hasParkedBit>::unlockSlow): * wtf/MainThread.cpp: * wtf/RunLoopTimer.h: * wtf/Threading.cpp: * wtf/Threading.h: * wtf/ThreadingPthreads.cpp: (WTF::Thread::yield): * wtf/ThreadingWin.cpp: (WTF::Thread::yield): * wtf/WordLock.cpp: (WTF::WordLockBase::lockSlow): (WTF::WordLockBase::unlockSlow): Canonical link: https://commits.webkit.org/191572@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@219763 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-07-22 14:36:18 +00:00
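The Source/WTF note above mentions a Thread::yield() abstraction over sched_yield(). A rough sketch of what such a wrapper could look like follows. It is an illustration under stated assumptions rather than the WTF code: SketchThread and spinUntilSet are hypothetical names, and the use of SwitchToThread() on Windows is an assumption about the platform split.

```cpp
#include <atomic>
#include <cassert>
#if defined(_WIN32)
#include <windows.h>
#else
#include <sched.h>
#endif

// Hypothetical stand-in for a Thread::yield() style abstraction: one call
// site for "give up the CPU", with the platform choice made in one place.
struct SketchThread {
    static void yield()
    {
#if defined(_WIN32)
        SwitchToThread(); // Assumed Windows counterpart of sched_yield().
#else
        sched_yield();
#endif
    }
};

// Hypothetical helper: poll a flag up to maxAttempts times, yielding between
// polls. Returns whether the flag was observed set.
inline bool spinUntilSet(const std::atomic<bool>& flag, unsigned maxAttempts)
{
    for (unsigned i = 0; i < maxAttempts; ++i) {
        if (flag.load(std::memory_order_acquire))
            return true;
        SketchThread::yield();
    }
    return flag.load(std::memory_order_acquire);
}
```

Routing every yield through one function is what makes it easy to experiment with a different yielding strategy later, since only the wrapper body has to change.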
NEVER_INLINE static void lockSlow(Atomic<LockType>& lock);
enum Fairness {
Unfair,
Fair
};
NEVER_INLINE static void unlockSlow(Atomic<LockType>& lock, Fairness fairness = Unfair);
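The ChangeLog describes LockAlgorithm as a struct templatized on the lock type, the isHeldBit value, and the hasParkedBit value, with static methods that take the atomic word as their first argument. A minimal sketch of that shape follows. It is not the WebKit implementation: `ToyLockAlgorithm` is a hypothetical name, `std::atomic` stands in for WTF's `Atomic<>`, and the slow path simply yields instead of parking in ParkingLot, so `hasParkedBit` and the `Fairness` argument are not exercised here.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical illustration only. The real algorithm parks waiting threads
// and uses hasParkedBit to record that; this toy version only spins.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct ToyLockAlgorithm {
    // Try to set isHeldBit without disturbing any other bits in the word.
    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType oldValue = word.load(std::memory_order_relaxed);
        for (;;) {
            if (oldValue & isHeldBit)
                return false; // Someone else holds the lock.
            LockType newValue = static_cast<LockType>(oldValue | isHeldBit);
            if (word.compare_exchange_weak(oldValue, newValue,
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
            // On failure oldValue was reloaded; loop and re-check.
        }
    }

    // Stand-in slow path: yield until tryLock succeeds (no parking).
    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    // Clear isHeldBit and leave every other bit as it was.
    static void unlock(std::atomic<LockType>& word)
    {
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }
};
```

Because the methods are static and touch only the two designated bits, the same algorithm can be projected onto a field that stores unrelated data in its remaining bits, for example `ToyLockAlgorithm<uint8_t, 1, 2>::lock(word)` on a byte whose upper bits hold other flags.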
* heap/MarkStack.cpp: (JSC::MarkStackArray::transferTo): * heap/MarkStack.h: * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::tryAllocateIn): * heap/MarkedBlock.cpp: (JSC::MarkedBlock::MarkedBlock): (JSC::MarkedBlock::Handle::specializedSweep): (JSC::MarkedBlock::Handle::sweep): (JSC::MarkedBlock::Handle::sweepHelperSelectMarksMode): (JSC::MarkedBlock::Handle::stopAllocating): (JSC::MarkedBlock::Handle::resumeAllocating): (JSC::MarkedBlock::aboutToMarkSlow): (JSC::MarkedBlock::Handle::didConsumeFreeList): (JSC::SetNewlyAllocatedFunctor::SetNewlyAllocatedFunctor): Deleted. (JSC::SetNewlyAllocatedFunctor::operator()): Deleted. * heap/MarkedBlock.h: * heap/MarkedSpace.cpp: (JSC::MarkedSpace::resumeAllocating): * heap/SlotVisitor.cpp: (JSC::SlotVisitor::SlotVisitor): (JSC::SlotVisitor::~SlotVisitor): (JSC::SlotVisitor::reset): (JSC::SlotVisitor::clearMarkStacks): (JSC::SlotVisitor::appendJSCellOrAuxiliary): (JSC::SlotVisitor::setMarkedAndAppendToMarkStack): (JSC::SlotVisitor::appendToMarkStack): (JSC::SlotVisitor::appendToMutatorMarkStack): (JSC::SlotVisitor::visitChildren): (JSC::SlotVisitor::donateKnownParallel): (JSC::SlotVisitor::drain): (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::containsOpaqueRoot): (JSC::SlotVisitor::donateAndDrain): (JSC::SlotVisitor::mergeOpaqueRoots): (JSC::SlotVisitor::dump): (JSC::SlotVisitor::clearMarkStack): Deleted. (JSC::SlotVisitor::opaqueRootCount): Deleted. * heap/SlotVisitor.h: (JSC::SlotVisitor::collectorMarkStack): (JSC::SlotVisitor::mutatorMarkStack): (JSC::SlotVisitor::isEmpty): (JSC::SlotVisitor::bytesVisited): (JSC::SlotVisitor::markStack): Deleted. (JSC::SlotVisitor::bytesCopied): Deleted. 
* heap/SlotVisitorInlines.h: (JSC::SlotVisitor::reportExtraMemoryVisited): (JSC::SlotVisitor::reportExternalMemoryVisited): * jit/AssemblyHelpers.cpp: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): * jit/AssemblyHelpers.h: (JSC::AssemblyHelpers::emitStoreStructureWithTypeInfo): (JSC::AssemblyHelpers::barrierStoreLoadFence): (JSC::AssemblyHelpers::mutatorFence): (JSC::AssemblyHelpers::storeButterfly): (JSC::AssemblyHelpers::jumpIfMutatorFenceNotNeeded): (JSC::AssemblyHelpers::emitInitializeInlineStorage): (JSC::AssemblyHelpers::emitInitializeOutOfLineStorage): (JSC::AssemblyHelpers::jumpIfBarrierStoreLoadFenceNotNeeded): Deleted. * jit/JITInlines.h: (JSC::JIT::emitArrayProfilingSiteWithCell): * jit/JITOperations.cpp: * jit/JITPropertyAccess.cpp: (JSC::JIT::emit_op_put_to_scope): (JSC::JIT::emit_op_put_to_arguments): * llint/LLIntData.cpp: (JSC::LLInt::Data::performAssertions): * llint/LowLevelInterpreter.asm: * llint/LowLevelInterpreter64.asm: * runtime/ButterflyInlines.h: (JSC::Butterfly::create): (JSC::Butterfly::createOrGrowPropertyStorage): * runtime/ConcurrentJITLock.h: (JSC::GCSafeConcurrentJITLocker::NoDefer::NoDefer): Deleted. * runtime/GenericArgumentsInlines.h: (JSC::GenericArguments<Type>::getOwnPropertySlotByIndex): (JSC::GenericArguments<Type>::putByIndex): * runtime/IndexingType.h: * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): (JSC::JSArray::unshiftCountWithArrayStorage): * runtime/JSCell.h: (JSC::JSCell::InternalLocker::InternalLocker): (JSC::JSCell::InternalLocker::~InternalLocker): (JSC::JSCell::atomicCompareExchangeCellStateWeakRelaxed): (JSC::JSCell::atomicCompareExchangeCellStateStrong): (JSC::JSCell::indexingTypeAndMiscOffset): (JSC::JSCell::indexingTypeOffset): Deleted. 
* runtime/JSCellInlines.h: (JSC::JSCell::JSCell): (JSC::JSCell::finishCreation): (JSC::JSCell::indexingTypeAndMisc): (JSC::JSCell::indexingType): (JSC::JSCell::setStructure): (JSC::JSCell::callDestructor): (JSC::JSCell::lockInternalLock): (JSC::JSCell::unlockInternalLock): * runtime/JSObject.cpp: (JSC::JSObject::visitButterfly): (JSC::JSObject::visitChildren): (JSC::JSFinalObject::visitChildren): (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): (JSC::JSObject::createInitialUndecided): (JSC::JSObject::createInitialInt32): (JSC::JSObject::createInitialDouble): (JSC::JSObject::createInitialContiguous): (JSC::JSObject::createArrayStorage): (JSC::JSObject::convertUndecidedToArrayStorage): (JSC::JSObject::convertInt32ToArrayStorage): (JSC::JSObject::convertDoubleToArrayStorage): (JSC::JSObject::convertContiguousToArrayStorage): (JSC::JSObject::deleteProperty): (JSC::JSObject::defineOwnIndexedProperty): (JSC::JSObject::increaseVectorLength): (JSC::JSObject::ensureLengthSlow): (JSC::JSObject::reallocateAndShrinkButterfly): (JSC::JSObject::allocateMoreOutOfLineStorage): (JSC::JSObject::shiftButterflyAfterFlattening): (JSC::JSObject::growOutOfLineStorage): Deleted. * runtime/JSObject.h: (JSC::JSFinalObject::JSFinalObject): (JSC::JSObject::setButterfly): (JSC::JSObject::getOwnNonIndexPropertySlot): (JSC::JSObject::fillCustomGetterPropertySlot): (JSC::JSObject::getOwnPropertySlot): (JSC::JSObject::getPropertySlot): (JSC::JSObject::setStructureAndButterfly): Deleted. (JSC::JSObject::setButterflyWithoutChangingStructure): Deleted. (JSC::JSObject::putDirectInternal): Deleted. (JSC::JSObject::putDirectWithoutTransition): Deleted. 
* runtime/JSObjectInlines.h: (JSC::JSObject::getPropertySlot): (JSC::JSObject::getNonIndexPropertySlot): (JSC::JSObject::putDirectWithoutTransition): (JSC::JSObject::putDirectInternal): * runtime/Options.h: * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: (JSC::Structure::dumpStatistics): (JSC::Structure::findStructuresAndMapForMaterialization): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::changePrototypeTransition): (JSC::Structure::attributeChangeTransition): (JSC::Structure::toDictionaryTransition): (JSC::Structure::takePropertyTableOrCloneIfPinned): (JSC::Structure::nonPropertyTransition): (JSC::Structure::isSealed): (JSC::Structure::isFrozen): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::pin): (JSC::Structure::pinForCaching): (JSC::Structure::willStoreValueSlow): (JSC::Structure::copyPropertyTableForPinning): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::getPropertyNamesFromStructure): (JSC::Structure::visitChildren): (JSC::Structure::materializePropertyMap): Deleted. (JSC::Structure::addPropertyWithoutTransition): Deleted. (JSC::Structure::removePropertyWithoutTransition): Deleted. (JSC::Structure::copyPropertyTable): Deleted. (JSC::Structure::createPropertyMap): Deleted. (JSC::PropertyTable::checkConsistency): Deleted. (JSC::Structure::checkConsistency): Deleted. * runtime/Structure.h: * runtime/StructureIDBlob.h: (JSC::StructureIDBlob::StructureIDBlob): (JSC::StructureIDBlob::indexingTypeIncludingHistory): (JSC::StructureIDBlob::setIndexingTypeIncludingHistory): (JSC::StructureIDBlob::indexingTypeIncludingHistoryOffset): (JSC::StructureIDBlob::indexingType): Deleted. (JSC::StructureIDBlob::setIndexingType): Deleted. (JSC::StructureIDBlob::indexingTypeOffset): Deleted. 
* runtime/StructureInlines.h: (JSC::Structure::get): (JSC::Structure::checkOffsetConsistency): (JSC::Structure::checkConsistency): (JSC::Structure::add): (JSC::Structure::remove): (JSC::Structure::addPropertyWithoutTransition): (JSC::Structure::removePropertyWithoutTransition): (JSC::Structure::setPropertyTable): (JSC::Structure::putWillGrowOutOfLineStorage): Deleted. (JSC::Structure::propertyTable): Deleted. (JSC::Structure::suggestedNewOutOfLineStorageCapacity): Deleted. Source/WTF: The reason why I went to such great pains to make WTF::Lock fit in two bits is that I knew that I would eventually need to stuff one into some miscellaneous bits of the JSCell header. That time has come, because the concurrent GC has numerous race conditions in visitChildren that can be trivially fixed if each object just has an internal lock. Some cell types might use it to simply protect their entire visitChildren function and anything that mutates the fields it touches, while other cell types might use it as a "lock of last resort" to handle corner cases of an otherwise wait-free or lock-free algorithm. Right now, it's used to protect certain transformations involving indexing storage. To make this happen, I factored the WTF::Lock algorithm into a LockAlgorithm struct that is templatized on lock type (uint8_t for WTF::Lock), the isHeldBit value (1 for WTF::Lock), and the hasParkedBit value (2 for WTF::Lock). This could have been done as a templatized Lock class that basically contains Atomic<LockType>. You could then make any field into a lock by bitwise_casting it to TemplateLock<field type, bit1, bit2>. But this felt too dirty, so instead, LockAlgorithm has static methods that take Atomic<LockType>& as their first argument. I think that this makes it more natural to project a LockAlgorithm onto an existing Atomic<> field. Sadly, some places have to cast their non-Atomic<> field to Atomic<> in order for this to work. 
Like so many other things we do, this just shows that the C++ style of labeling fields that are subject to atomic ops as atomic is counterproductive. Maybe some day I'll change LockAlgorithm to use our other Atomics API, which does not require Atomic<>. WTF::Lock now uses LockAlgorithm. The slow paths are still outlined. I don't feel too bad about the LockAlgorithm.h header being included in so many places because we change that algorithm so infrequently. Also, I added a hasElapsed(time) function. This function makes it so much more natural to write timeslicing code, which the concurrent GC has to do a lot of. * WTF.xcodeproj/project.pbxproj: * wtf/CMakeLists.txt: * wtf/ListDump.h: * wtf/Lock.cpp: (WTF::LockBase::lockSlow): (WTF::LockBase::unlockSlow): (WTF::LockBase::unlockFairlySlow): (WTF::LockBase::unlockSlowImpl): Deleted. * wtf/Lock.h: (WTF::LockBase::lock): (WTF::LockBase::tryLock): (WTF::LockBase::unlock): (WTF::LockBase::unlockFairly): (WTF::LockBase::isHeld): (): Deleted. * wtf/LockAlgorithm.h: Added. (WTF::LockAlgorithm::lockFastAssumingZero): (WTF::LockAlgorithm::lockFast): (WTF::LockAlgorithm::lock): (WTF::LockAlgorithm::tryLock): (WTF::LockAlgorithm::unlockFastAssumingZero): (WTF::LockAlgorithm::unlockFast): (WTF::LockAlgorithm::unlock): (WTF::LockAlgorithm::unlockFairly): (WTF::LockAlgorithm::isLocked): (WTF::LockAlgorithm::lockSlow): (WTF::LockAlgorithm::unlockSlow): * wtf/TimeWithDynamicClockType.cpp: (WTF::hasElapsed): * wtf/TimeWithDynamicClockType.h: Canonical link: https://commits.webkit.org/182434@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208720 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-15 01:49:22 +00:00
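The Source/WTF notes above describe LockAlgorithm as a set of static methods, templatized on the lock word type and on the two bit values, that operate on an existing atomic field. The following is a minimal sketch of that shape only, not the WebKit implementation: it uses `std::atomic` instead of WTF's `Atomic<>`, the name `SketchLockAlgorithm` is hypothetical, and waiting is done by yielding rather than by parking a thread, so `hasParkedBit` is declared but never set.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the idea described in the commit message:
// static methods "projected" onto a caller-owned atomic word. Only
// isHeldBit is modified, so unrelated bits in the same word are preserved.
// Assumption: contended waiters yield in a loop; the real algorithm parks.
template<typename LockType, LockType isHeldBit, LockType hasParkedBit>
struct SketchLockAlgorithm {
    static_assert(!(isHeldBit & hasParkedBit), "the two bits must not overlap");

    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        for (;;) {
            if (old & isHeldBit)
                return false;
            LockType desired = static_cast<LockType>(old | isHeldBit);
            // On failure, compare_exchange_weak reloads `old` for the retry.
            if (word.compare_exchange_weak(old, desired, std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        }
    }

    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word))
            std::this_thread::yield();
    }

    static void unlock(std::atomic<LockType>& word)
    {
        assert(isLocked(word));
        word.fetch_and(static_cast<LockType>(~isHeldBit), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return (word.load(std::memory_order_relaxed) & isHeldBit) != 0;
    }
};

// Same parameters the commit message quotes for WTF::Lock:
// uint8_t word, isHeldBit = 1, hasParkedBit = 2.
using SketchByteLock = SketchLockAlgorithm<std::uint8_t, 1, 2>;
```

Because the methods take the word by reference, a byte that also carries other flags (for example, spare bits in an object header) can serve as the lock without a wrapper class, which is the design choice the commit message argues for.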
PerformanceTests: Concurrent GC should be stable enough to land enabled https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Made CDjs more configurable and refined the "large.js" configuration. I was using that one and the new "long.js" configuration to tune concurrent eden GCs. Added a new way of running Splay in browser, which uses chartjs to plot the execution times of 2000 iterations. This includes the minified chartjs. * JetStream/Octane2/splay-detail.html: Added. * JetStream/cdjs/benchmark.js: (benchmarkImpl): (benchmark): * JetStream/cdjs/long.js: Added. Source/JavaScriptCore: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. This fixes a ton of performance and correctness bugs revealed by getting the concurrent GC to be stable enough to land enabled. I had to redo the JSObject::visitChildren concurrency protocol again. This time I think it's even more correct than ever! This is an enormous win on JetStream/splay-latency and Octane/SplayLatency. It looks to be mostly neutral on everything else, though Speedometer is showing statistically weak signs of a slight regression. * API/JSAPIWrapperObject.mm: Added locking. (JSC::JSAPIWrapperObject::visitChildren): * API/JSCallbackObject.h: Added locking. (JSC::JSCallbackObjectData::visitChildren): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::setPrivateProperty): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::deletePrivateProperty): (JSC::JSCallbackObjectData::JSPrivatePropertyMap::visitChildren): * CMakeLists.txt: * JavaScriptCore.xcodeproj/project.pbxproj: * bytecode/CodeBlock.cpp: (JSC::CodeBlock::UnconditionalFinalizer::finalizeUnconditionally): This had a TOCTOU race on shouldJettisonDueToOldAge. (JSC::EvalCodeCache::visitAggregate): Moved to EvalCodeCache.cpp. * bytecode/DirectEvalCodeCache.cpp: Added. Outlined some functions and made them use locks. 
(JSC::DirectEvalCodeCache::setSlow): (JSC::DirectEvalCodeCache::clear): (JSC::DirectEvalCodeCache::visitAggregate): * bytecode/DirectEvalCodeCache.h: (JSC::DirectEvalCodeCache::set): (JSC::DirectEvalCodeCache::clear): Deleted. * bytecode/UnlinkedCodeBlock.cpp: Added locking. (JSC::UnlinkedCodeBlock::visitChildren): (JSC::UnlinkedCodeBlock::setInstructions): (JSC::UnlinkedCodeBlock::shrinkToFit): * bytecode/UnlinkedCodeBlock.h: Added locking. (JSC::UnlinkedCodeBlock::addRegExp): (JSC::UnlinkedCodeBlock::addConstant): (JSC::UnlinkedCodeBlock::addFunctionDecl): (JSC::UnlinkedCodeBlock::addFunctionExpr): (JSC::UnlinkedCodeBlock::createRareDataIfNecessary): (JSC::UnlinkedCodeBlock::shrinkToFit): Deleted. * debugger/Debugger.cpp: Use the right delete API. (JSC::Debugger::recompileAllJSFunctions): * dfg/DFGAbstractInterpreterInlines.h: (JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects): Fix a pre-existing bug in ToFunction constant folding. * dfg/DFGClobberize.h: Add support for nuking. (JSC::DFG::clobberize): * dfg/DFGClobbersExitState.cpp: Add support for nuking. (JSC::DFG::clobbersExitState): * dfg/DFGFixupPhase.cpp: Add support for nuking. (JSC::DFG::FixupPhase::fixupNode): (JSC::DFG::FixupPhase::indexForChecks): (JSC::DFG::FixupPhase::originForCheck): (JSC::DFG::FixupPhase::speculateForBarrier): (JSC::DFG::FixupPhase::insertCheck): (JSC::DFG::FixupPhase::fixupChecksInBlock): * dfg/DFGSpeculativeJIT.cpp: Add support for nuking. (JSC::DFG::SpeculativeJIT::compileAllocatePropertyStorage): (JSC::DFG::SpeculativeJIT::compileReallocatePropertyStorage): * ftl/FTLLowerDFGToB3.cpp: Add support for nuking. (JSC::FTL::DFG::LowerDFGToB3::allocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::reallocatePropertyStorage): (JSC::FTL::DFG::LowerDFGToB3::mutatorFence): (JSC::FTL::DFG::LowerDFGToB3::nukeStructureAndSetButterfly): (JSC::FTL::DFG::LowerDFGToB3::setButterfly): Deleted. 
* heap/CodeBlockSet.cpp: We need to be more careful about the CodeBlockSet workflow during GC, since we will allocate CodeBlocks in eden while collecting. (JSC::CodeBlockSet::clearMarksForFullCollection): (JSC::CodeBlockSet::deleteUnmarkedAndUnreferenced): * heap/Heap.cpp: Added code to measure max pauses. Added a better collectContinuously mode. (JSC::Heap::lastChanceToFinalize): Stop the collectContinuously thread. (JSC::Heap::harvestWeakReferences): Inline SlotVisitor::harvestWeakReferences. (JSC::Heap::finalizeUnconditionalFinalizers): Inline SlotVisitor::finalizeUnconditionalReferences. (JSC::Heap::markToFixpoint): We need to do some MarkedSpace stuff before every conservative scan, rather than just at the start of marking, so we now call prepareForConservativeScan() before each conservative scan. Also call a less-parallel version of drainInParallel when the mutator is running. (JSC::Heap::collectInThread): Inline Heap::prepareForAllocation(). (JSC::Heap::stopIfNecessarySlow): We need to be more careful about ensuring that we run finalization before and after stopping. Also, we should sanitize stack when stopping the world. (JSC::Heap::acquireAccessSlow): Add some optional debug prints. (JSC::Heap::handleNeedFinalize): Assert that we are running this when the world is not stopped. (JSC::Heap::finalize): Remove the old collectContinuously code. (JSC::Heap::requestCollection): We don't need to sanitize stack here anymore. (JSC::Heap::notifyIsSafeToCollect): Start the collectContinuously thread. It will request collection at 1 kHz. (JSC::Heap::prepareForAllocation): Deleted. (JSC::Heap::preventCollection): Prevent any new concurrent GCs from being initiated. (JSC::Heap::allowCollection): (JSC::Heap::forEachSlotVisitor): Allows us to safely iterate slot visitors. * heap/Heap.h: * heap/HeapInlines.h: (JSC::Heap::writeBarrier): If the 'to' cell is not NewWhite then it could be AnthraciteOrBlack. 
During a full collection, objects may be AnthraciteOrBlack from a previous GC. Turns out, we don't benefit from this optimization so we can just kill it. * heap/HeapSnapshotBuilder.cpp: (JSC::HeapSnapshotBuilder::buildSnapshot): This needs to use PreventCollectionScope to ensure snapshot soundness. * heap/ListableHandler.h: (JSC::ListableHandler::isOnList): Useful helper. * heap/LockDuringMarking.h: (JSC::lockDuringMarking): It's a locker that only locks while we're marking. * heap/MarkedAllocator.cpp: (JSC::MarkedAllocator::addBlock): Hold the bitvector lock while resizing. * heap/MarkedBlock.cpp: Hold the bitvector lock while accessing the bitvectors while the mutator is running. * heap/MarkedSpace.cpp: (JSC::MarkedSpace::prepareForConservativeScan): We used to do this in prepareForMarking, but we need to do it before each conservative scan not just before marking. (JSC::MarkedSpace::prepareForMarking): Remove the logic moved to prepareForConservativeScan. * heap/MarkedSpace.h: * heap/PreventCollectionScope.h: Added. * heap/SlotVisitor.cpp: Refactored drainFromShared so that we can write a similar function called drainInParallelPassively. (JSC::SlotVisitor::updateMutatorIsStopped): Update whether we can use "fast" scanning. (JSC::SlotVisitor::mutatorIsStoppedIsUpToDate): (JSC::SlotVisitor::didReachTermination): (JSC::SlotVisitor::hasWork): (JSC::SlotVisitor::drain): This now uses the rightToRun lock to allow the main GC thread to safepoint the workers. (JSC::SlotVisitor::drainFromShared): (JSC::SlotVisitor::drainInParallelPassively): This runs marking with one fewer threads than normal. It's useful for when we have resumed the mutator, since then the mutator has a better chance of getting on a core. (JSC::SlotVisitor::addWeakReferenceHarvester): (JSC::SlotVisitor::addUnconditionalFinalizer): (JSC::SlotVisitor::harvestWeakReferences): Deleted. (JSC::SlotVisitor::finalizeUnconditionalFinalizers): Deleted. 
* heap/SlotVisitor.h: * heap/SlotVisitorInlines.h: Outline stuff. (JSC::SlotVisitor::addWeakReferenceHarvester): Deleted. (JSC::SlotVisitor::addUnconditionalFinalizer): Deleted. * runtime/InferredType.cpp: This needed thread safety. (JSC::InferredType::visitChildren): This needs to keep its structure finalizer alive until it runs. (JSC::InferredType::set): (JSC::InferredType::InferredStructureFinalizer::finalizeUnconditionally): * runtime/InferredType.h: * runtime/InferredValue.cpp: This needed thread safety. (JSC::InferredValue::visitChildren): (JSC::InferredValue::ValueCleanup::finalizeUnconditionally): * runtime/JSArray.cpp: (JSC::JSArray::unshiftCountSlowCase): Update to use new butterfly API. (JSC::JSArray::unshiftCountWithArrayStorage): Update to use new butterfly API. * runtime/JSArrayBufferView.cpp: (JSC::JSArrayBufferView::visitChildren): Thread safety. * runtime/JSCell.h: (JSC::JSCell::setStructureIDDirectly): This is used for nuking the structure. (JSC::JSCell::InternalLocker::InternalLocker): Deleted. The cell is now the lock. (JSC::JSCell::InternalLocker::~InternalLocker): Deleted. The cell is now the lock. * runtime/JSCellInlines.h: (JSC::JSCell::structure): Clean this up. (JSC::JSCell::lock): The cell is now the lock. (JSC::JSCell::tryLock): (JSC::JSCell::unlock): (JSC::JSCell::isLocked): (JSC::JSCell::lockInternalLock): Deleted. (JSC::JSCell::unlockInternalLock): Deleted. * runtime/JSFunction.cpp: (JSC::JSFunction::visitChildren): Thread safety. * runtime/JSGenericTypedArrayViewInlines.h: (JSC::JSGenericTypedArrayView<Adaptor>::visitChildren): Thread safety. (JSC::JSGenericTypedArrayView<Adaptor>::slowDownAndWasteMemory): Thread safety. * runtime/JSObject.cpp: (JSC::JSObject::markAuxiliaryAndVisitOutOfLineProperties): Factor out this "easy" step of butterfly visiting. (JSC::JSObject::visitButterfly): Make this achieve 100% precision about structure-butterfly relationships. 
This relies on the mutator "nuking" the structure prior to "locked" structure-butterfly transitions. (JSC::JSObject::visitChildren): Use the new, nicer API. (JSC::JSFinalObject::visitChildren): Use the new, nicer API. (JSC::JSObject::enterDictionaryIndexingModeWhenArrayStorageAlreadyExists): Use the new butterfly API. (JSC::JSObject::createInitialUndecided): Use the new butterfly API. (JSC::JSObject::createInitialInt32): Use the new butterfly API. (JSC::JSObject::createInitialDouble): Use the new butterfly API. (JSC::JSObject::createInitialContiguous): Use the new butterfly API. (JSC::JSObject::createArrayStorage): Use the new butterfly API. (JSC::JSObject::convertUndecidedToContiguous): Use the new butterfly API. (JSC::JSObject::convertUndecidedToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertInt32ToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertDoubleToContiguous): Use the new butterfly API. (JSC::JSObject::convertDoubleToArrayStorage): Use the new butterfly API. (JSC::JSObject::convertContiguousToArrayStorage): Use the new butterfly API. (JSC::JSObject::increaseVectorLength): Use the new butterfly API. (JSC::JSObject::shiftButterflyAfterFlattening): Use the new butterfly API. * runtime/JSObject.h: (JSC::JSObject::setButterfly): This now does all of the fences. Only use this when you are not also transitioning the structure or the structure's lastOffset. (JSC::JSObject::nukeStructureAndSetButterfly): Use this when doing locked structure-butterfly transitions. * runtime/JSObjectInlines.h: (JSC::JSObject::putDirectWithoutTransition): Use the newly factored out API. (JSC::JSObject::prepareToPutDirectWithoutTransition): Factor this out! (JSC::JSObject::putDirectInternal): Use the newly factored out API. * runtime/JSPropertyNameEnumerator.cpp: (JSC::JSPropertyNameEnumerator::finishCreation): Locks! (JSC::JSPropertyNameEnumerator::visitChildren): Locks! 
* runtime/JSSegmentedVariableObject.cpp: (JSC::JSSegmentedVariableObject::visitChildren): Locks! * runtime/JSString.cpp: (JSC::JSString::visitChildren): Thread safety. * runtime/ModuleProgramExecutable.cpp: (JSC::ModuleProgramExecutable::visitChildren): Thread safety. * runtime/Options.cpp: For now we disable concurrent GC on not-X86_64. (JSC::recomputeDependentOptions): * runtime/Options.h: Change the default max GC parallelism to 8. I don't know why it was still 7. * runtime/SamplingProfiler.cpp: (JSC::SamplingProfiler::stackTracesAsJSON): This needs to defer GC before grabbing its lock. * runtime/SparseArrayValueMap.cpp: This needed thread safety. (JSC::SparseArrayValueMap::add): (JSC::SparseArrayValueMap::remove): (JSC::SparseArrayValueMap::visitChildren): * runtime/SparseArrayValueMap.h: * runtime/Structure.cpp: This had a race between addNewPropertyTransition and visitChildren. (JSC::Structure::Structure): (JSC::Structure::materializePropertyTable): (JSC::Structure::addNewPropertyTransition): (JSC::Structure::flattenDictionaryStructure): (JSC::Structure::add): Help out with nuking support - the m_offset needs to play along. (JSC::Structure::visitChildren): * runtime/Structure.h: Make some useful things public - like the notion of a lastOffset. * runtime/StructureChain.cpp: (JSC::StructureChain::visitChildren): Thread safety! * runtime/StructureChain.h: Thread safety! * runtime/StructureIDTable.cpp: (JSC::StructureIDTable::allocateID): Ensure that we don't get nuked IDs. * runtime/StructureIDTable.h: Add the notion of a nuked ID! It's a bit that the runtime never sees except during specific shady actions like locked structure-butterfly transitions. "Nuking" tells the GC to steer clear and rescan once we fire the barrier. (JSC::nukedStructureIDBit): (JSC::nuke): (JSC::isNuked): (JSC::decontaminate): * runtime/StructureInlines.h: (JSC::Structure::hasIndexingHeader): Better API. (JSC::Structure::add): * runtime/VM.cpp: Better GC interaction. 
(JSC::VM::ensureWatchdog): (JSC::VM::deleteAllLinkedCode): (JSC::VM::deleteAllCode): * runtime/VM.h: (JSC::VM::getStructure): Why wasn't this always an API! * runtime/WebAssemblyExecutable.cpp: (JSC::WebAssemblyExecutable::visitChildren): Thread safety. Source/WebCore: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Made WebCore down with concurrent marking by adding some locking and adapting to some new API. This has new test modes in run-jsc-stress-tests. Also, the way that LayoutTests run is already a fantastic GC test. * ForwardingHeaders/heap/DeleteAllCodeEffort.h: Added. * ForwardingHeaders/heap/LockDuringMarking.h: Added. * bindings/js/GCController.cpp: (WebCore::GCController::deleteAllCode): (WebCore::GCController::deleteAllLinkedCode): * bindings/js/GCController.h: * bindings/js/JSDOMBinding.cpp: (WebCore::getCachedDOMStructure): (WebCore::cacheDOMStructure): * bindings/js/JSDOMGlobalObject.cpp: (WebCore::JSDOMGlobalObject::addBuiltinGlobals): (WebCore::JSDOMGlobalObject::visitChildren): * bindings/js/JSDOMGlobalObject.h: (WebCore::getDOMConstructor): * bindings/js/JSDOMPromise.cpp: (WebCore::DeferredPromise::DeferredPromise): (WebCore::DeferredPromise::clear): * bindings/js/JSXPathResultCustom.cpp: (WebCore::JSXPathResult::visitAdditionalChildren): * dom/EventListenerMap.cpp: (WebCore::EventListenerMap::clear): (WebCore::EventListenerMap::replace): (WebCore::EventListenerMap::add): (WebCore::EventListenerMap::remove): (WebCore::EventListenerMap::find): (WebCore::EventListenerMap::removeFirstEventListenerCreatedFromMarkup): (WebCore::EventListenerMap::copyEventListenersNotCreatedFromMarkupToTarget): (WebCore::EventListenerIterator::EventListenerIterator): * dom/EventListenerMap.h: (WebCore::EventListenerMap::lock): * dom/EventTarget.cpp: (WebCore::EventTarget::visitJSEventListeners): * dom/EventTarget.h: (WebCore::EventTarget::visitJSEventListeners): Deleted. 
* dom/Node.cpp: (WebCore::Node::eventTargetDataConcurrently): (WebCore::Node::ensureEventTargetData): (WebCore::Node::clearEventTargetData): * dom/Node.h: * page/MemoryRelease.cpp: (WebCore::releaseCriticalMemory): * page/cocoa/MemoryReleaseCocoa.mm: (WebCore::jettisonExpensiveObjectsOnTopLevelNavigation): (WebCore::registerMemoryReleaseNotifyCallbacks): Source/WTF: Concurrent GC should be stable enough to land enabled on X86_64 https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Adds the ability to say: auto locker = holdLock(any type of lock) Instead of having to say: Locker<LockType> locker(locks of type LockType) I think that we should use "auto locker = holdLock(lock)" as the default way that we acquire locks unless we need to use a special locker type. This also adds the ability to safepoint a lock. Safepointing a lock is basically a super fast way of unlocking it fairly and then immediately relocking it - i.e. letting anyone who is waiting run, without losing steam if there is no one waiting. * wtf/Lock.cpp: (WTF::LockBase::safepointSlow): * wtf/Lock.h: (WTF::LockBase::safepoint): * wtf/LockAlgorithm.h: (WTF::LockAlgorithm::safepointFast): (WTF::LockAlgorithm::safepoint): (WTF::LockAlgorithm::safepointSlow): * wtf/Locker.h: (WTF::AbstractLocker::AbstractLocker): (WTF::Locker::tryLock): (WTF::Locker::operator bool): (WTF::Locker::Locker): (WTF::Locker::operator=): (WTF::holdLock): (WTF::tryHoldLock): Tools: Concurrent GC should be stable enough to land enabled https://bugs.webkit.org/show_bug.cgi?id=164990 Reviewed by Geoffrey Garen. Add a new mode that runs GC continuously. Also made eager modes run GC continuously. It's clear that this works just fine in release, but I'm still trying to figure out if it's safe for debug. It might be too slow for debug. 
* Scripts/run-jsc-stress-tests: Canonical link: https://commits.webkit.org/183229@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@209570 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-12-08 22:14:50 +00:00
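The `auto locker = holdLock(lock)` idiom from the Source/WTF notes above relies on template argument deduction so that callers do not have to spell out the lock type. A minimal sketch of how such a helper can be written follows; `SketchLocker`, `sketchHoldLock`, and `CountingLock` are hypothetical names for illustration and are not the contents of wtf/Locker.h.

```cpp
#include <cassert>
#include <utility>

// Hypothetical RAII locker: acquires in the constructor, releases in the
// destructor, and is movable so that it can be returned by value.
template<typename LockType>
class SketchLocker {
public:
    explicit SketchLocker(LockType& lock)
        : m_lock(&lock)
    {
        m_lock->lock();
    }

    SketchLocker(SketchLocker&& other) noexcept
        : m_lock(std::exchange(other.m_lock, nullptr))
    {
    }

    SketchLocker(const SketchLocker&) = delete;
    SketchLocker& operator=(const SketchLocker&) = delete;

    ~SketchLocker()
    {
        if (m_lock)
            m_lock->unlock();
    }

private:
    LockType* m_lock;
};

// The lock type is deduced from the argument, so call sites can use `auto`.
template<typename LockType>
SketchLocker<LockType> sketchHoldLock(LockType& lock)
{
    return SketchLocker<LockType>(lock);
}

// Toy lock used only to observe acquire/release in the demo below.
struct CountingLock {
    int depth = 0;
    void lock() { ++depth; }
    void unlock() { --depth; }
};

// Returns the lock depth after the scope exits (0 if the locker released).
inline int demoHoldLock()
{
    CountingLock lock;
    {
        auto locker = sketchHoldLock(lock);
        assert(lock.depth == 1);
    }
    return lock.depth;
}
```

Any type with `lock()` and `unlock()` members works with this sketch, which mirrors the "any type of lock" wording in the commit message.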
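    // Slow path of safepoint(): release the lock fairly so that a waiter can
    // take it, then immediately reacquire it.
    NEVER_INLINE static void safepointSlow(Atomic<LockType>& lockWord)
    {
        unlockFairly(lockWord);
        lock(lockWord);
    }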
private:
    enum Token {
        BargingOpportunity,
        DirectHandoff
    };
};

} // namespace WTF

using WTF::LockAlgorithm;
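
The commit notes above describe three ideas: a lock algorithm written as static methods over an existing atomic word and parameterized by two bit values, a safepoint that only does work when someone is waiting, and `auto locker = holdLock(lock)`. The following is a minimal standalone sketch of those ideas, not the WebKit implementation. Every name in it (`ToyLockAlgorithm`, `ToyLock`, `ToyLocker`, `hasWaiterBit`, `toyLockDemo`) is hypothetical, and it assumes standard `std::atomic` in place of WTF's `Atomic<>`. It also swaps in plain spinning with `std::this_thread::yield()` where the real code parks threads, so it does not model fair handoff: the slow path here is an ordinary unlock followed by a lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical sketch: lock operations projected onto two bits of a word that
// may carry unrelated data in its other bits.
template<typename LockType, LockType isHeldBit, LockType hasWaiterBit>
struct ToyLockAlgorithm {
    static constexpr LockType mask = isHeldBit | hasWaiterBit;

    static bool tryLock(std::atomic<LockType>& word)
    {
        LockType old = word.load(std::memory_order_relaxed);
        while (!(old & isHeldBit)) {
            LockType desired = static_cast<LockType>(old | isHeldBit);
            if (word.compare_exchange_weak(old, desired, std::memory_order_acquire))
                return true;
        }
        return false;
    }

    static void lock(std::atomic<LockType>& word)
    {
        while (!tryLock(word)) {
            // Assumption: advertise a waiter and yield instead of parking.
            word.fetch_or(hasWaiterBit, std::memory_order_relaxed);
            std::this_thread::yield();
        }
    }

    static void unlock(std::atomic<LockType>& word)
    {
        // Clear only the two lock bits; all other bits are preserved.
        word.fetch_and(static_cast<LockType>(~mask), std::memory_order_release);
    }

    static bool isLocked(const std::atomic<LockType>& word)
    {
        return word.load(std::memory_order_acquire) & isHeldBit;
    }

    // True when nobody has advertised that they are waiting.
    static bool safepointFast(const std::atomic<LockType>& word)
    {
        return !(word.load(std::memory_order_relaxed) & hasWaiterBit);
    }

    // Caller must hold the lock. Releases and reacquires only if needed.
    static void safepoint(std::atomic<LockType>& word)
    {
        if (!safepointFast(word)) {
            unlock(word);
            lock(word);
        }
    }
};

// A one-byte lock built on the algorithm, using bit values 1 and 2.
struct ToyLock {
    using Algorithm = ToyLockAlgorithm<uint8_t, 1, 2>;
    std::atomic<uint8_t> word { 0 };
    void lock() { Algorithm::lock(word); }
    void unlock() { Algorithm::unlock(word); }
    bool isHeld() const { return Algorithm::isLocked(word); }
};

// Movable RAII holder, so the lock type can be deduced by holdLock().
template<typename T>
class ToyLocker {
public:
    explicit ToyLocker(T& lock) : m_lock(&lock) { m_lock->lock(); }
    ToyLocker(ToyLocker&& other) : m_lock(other.m_lock) { other.m_lock = nullptr; }
    ToyLocker(const ToyLocker&) = delete;
    ~ToyLocker() { if (m_lock) m_lock->unlock(); }
private:
    T* m_lock;
};

template<typename T>
ToyLocker<T> holdLock(T& lock) { return ToyLocker<T>(lock); }

// Single-threaded walk-through: lock two spare bits of a "header" byte.
bool toyLockDemo()
{
    using Algorithm = ToyLockAlgorithm<uint8_t, 1, 2>;
    std::atomic<uint8_t> header { 0xF0 }; // upper bits stand in for other data
    Algorithm::lock(header);
    assert(Algorithm::isLocked(header));
    Algorithm::safepoint(header); // no waiter: the fast path does nothing
    Algorithm::unlock(header);
    return header.load() == 0xF0;
}
```

With this shape, a caller writes `auto locker = holdLock(someToyLock);` without naming the lock type, and code that already owns an atomic byte can call `ToyLockAlgorithm<uint8_t, 1, 2>::lock(byte)` directly without wrapping the byte in a lock object.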