2015-08-11 19:51:35 +00:00
/*
Silence leaks in ParkingLot
https://bugs.webkit.org/show_bug.cgi?id=155510
Reviewed by Alexey Proskuryakov.
ParkingLot has a concurrent hashtable that it reallocates on demand. It will not reallocate
it in steady state. The hashtable is sized to accommodate the high watermark of the number
of active threads - so long as the program doesn't just keep starting an unbounded number
of threads that are all active, the hashtable will stop resizing. Each resize operation is
designed to stay out of the way of the data-access-parallel normal path, in which two
threads operating on different lock addresses don't have to synchronize. To do this, it
simply drops the old hashtable without deleting it, so that threads that were still using
it don't crash. They will realize that they have the wrong hashtable before doing anything
bad, but we don't have a way of proving when all of those threads are no longer going to
read from the old hashtables. So, we just leak them.
This is a bounded leak, since the hashtable resizes exponentially. Thus the total memory
utilization of all hashtables, including the leaked ones, converges to a linear function of
the current hashtable's size (it's 2 * size of current hashtable).
But this leak is a problem for leaks tools, which will always report this leak. This is not
useful. It's better to silence the leak. That's what this patch does by ensuring that all
hashtables, including leaked ones, end up in a global vector. This is perf-neutral.
This requires making a StaticWordLock variant of WordLock. That's probably the biggest part
of this change.
* wtf/ParkingLot.cpp:
* wtf/WordLock.cpp:
(WTF::WordLockBase::lockSlow):
(WTF::WordLockBase::unlockSlow):
(WTF::WordLock::lockSlow): Deleted.
(WTF::WordLock::unlockSlow): Deleted.
* wtf/WordLock.h:
(WTF::WordLockBase::lock):
(WTF::WordLockBase::isLocked):
(WTF::WordLock::WordLock):
(WTF::WordLock::lock): Deleted.
(WTF::WordLock::isLocked): Deleted.
Canonical link: https://commits.webkit.org/173707@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@198345 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-03-17 18:54:15 +00:00
* Copyright ( C ) 2015 - 2016 Apple Inc . All rights reserved .
2015-08-11 19:51:35 +00:00
*
* Redistribution and use in source and binary forms , with or without
* modification , are permitted provided that the following conditions
* are met :
* 1. Redistributions of source code must retain the above copyright
* notice , this list of conditions and the following disclaimer .
* 2. Redistributions in binary form must reproduce the above copyright
* notice , this list of conditions and the following disclaimer in the
* documentation and / or other materials provided with the distribution .
*
* THIS SOFTWARE IS PROVIDED BY APPLE INC . ` ` AS IS ' ' AND ANY
* EXPRESS OR IMPLIED WARRANTIES , INCLUDING , BUT NOT LIMITED TO , THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
* PURPOSE ARE DISCLAIMED . IN NO EVENT SHALL APPLE INC . OR
* CONTRIBUTORS BE LIABLE FOR ANY DIRECT , INDIRECT , INCIDENTAL , SPECIAL ,
* EXEMPLARY , OR CONSEQUENTIAL DAMAGES ( INCLUDING , BUT NOT LIMITED TO ,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES ; LOSS OF USE , DATA , OR
* PROFITS ; OR BUSINESS INTERRUPTION ) HOWEVER CAUSED AND ON ANY THEORY
* OF LIABILITY , WHETHER IN CONTRACT , STRICT LIABILITY , OR TORT
* ( INCLUDING NEGLIGENCE OR OTHERWISE ) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE , EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE .
*/
# include "config.h"
2018-10-15 14:24:49 +00:00
# include <wtf/ParkingLot.h>
2015-08-11 19:51:35 +00:00
# include <mutex>
2018-10-15 14:24:49 +00:00
# include <wtf/DataLog.h>
# include <wtf/HashFunctions.h>
# include <wtf/StringPrintStream.h>
# include <wtf/ThreadSpecific.h>
# include <wtf/Threading.h>
# include <wtf/Vector.h>
# include <wtf/WeakRandom.h>
# include <wtf/WordLock.h>
2015-08-11 19:51:35 +00:00
namespace WTF {
namespace {
2020-05-22 02:17:54 +00:00
static constexpr bool verbose = false ;
2015-08-11 19:51:35 +00:00
2016-09-13 15:59:25 +00:00
struct ThreadData : public ThreadSafeRefCounted < ThreadData > {
2015-08-11 19:51:35 +00:00
WTF_MAKE_FAST_ALLOCATED ;
public :
ThreadData ( ) ;
~ ThreadData ( ) ;
[WTF] Introduce Thread class and use RefPtr<Thread> and align Windows Threading implementation semantics to Pthread one
https://bugs.webkit.org/show_bug.cgi?id=170502
Reviewed by Mark Lam.
Source/JavaScriptCore:
* API/tests/CompareAndSwapTest.cpp:
(testCompareAndSwap):
* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/air/testair.cpp:
* b3/testb3.cpp:
(JSC::B3::run):
* bytecode/SuperSampler.cpp:
(JSC::initializeSuperSampler):
* dfg/DFGWorklist.cpp:
* disassembler/Disassembler.cpp:
* heap/Heap.cpp:
(JSC::Heap::lastChanceToFinalize):
(JSC::Heap::notifyIsSafeToCollect):
* heap/Heap.h:
* heap/MachineStackMarker.cpp:
(JSC::MachineThreads::~MachineThreads):
(JSC::MachineThreads::addCurrentThread):
(JSC::MachineThreads::removeThread):
(JSC::MachineThreads::removeThreadIfFound):
(JSC::MachineThreads::MachineThread::MachineThread):
(JSC::MachineThreads::MachineThread::getRegisters):
(JSC::MachineThreads::MachineThread::Registers::stackPointer):
(JSC::MachineThreads::MachineThread::Registers::framePointer):
(JSC::MachineThreads::MachineThread::Registers::instructionPointer):
(JSC::MachineThreads::MachineThread::Registers::llintPC):
(JSC::MachineThreads::MachineThread::captureStack):
(JSC::MachineThreads::tryCopyOtherThreadStack):
(JSC::MachineThreads::tryCopyOtherThreadStacks):
(pthreadSignalHandlerSuspendResume): Deleted.
(JSC::threadData): Deleted.
(JSC::MachineThreads::Thread::Thread): Deleted.
(JSC::MachineThreads::Thread::createForCurrentThread): Deleted.
(JSC::MachineThreads::Thread::operator==): Deleted.
(JSC::MachineThreads::machineThreadForCurrentThread): Deleted.
(JSC::MachineThreads::ThreadData::ThreadData): Deleted.
(JSC::MachineThreads::ThreadData::~ThreadData): Deleted.
(JSC::MachineThreads::ThreadData::suspend): Deleted.
(JSC::MachineThreads::ThreadData::resume): Deleted.
(JSC::MachineThreads::ThreadData::getRegisters): Deleted.
(JSC::MachineThreads::ThreadData::Registers::stackPointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::framePointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::instructionPointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::llintPC): Deleted.
(JSC::MachineThreads::ThreadData::freeRegisters): Deleted.
(JSC::MachineThreads::ThreadData::captureStack): Deleted.
* heap/MachineStackMarker.h:
(JSC::MachineThreads::MachineThread::suspend):
(JSC::MachineThreads::MachineThread::resume):
(JSC::MachineThreads::MachineThread::threadID):
(JSC::MachineThreads::MachineThread::stackBase):
(JSC::MachineThreads::MachineThread::stackEnd):
(JSC::MachineThreads::threadsListHead):
(JSC::MachineThreads::Thread::operator!=): Deleted.
(JSC::MachineThreads::Thread::suspend): Deleted.
(JSC::MachineThreads::Thread::resume): Deleted.
(JSC::MachineThreads::Thread::getRegisters): Deleted.
(JSC::MachineThreads::Thread::freeRegisters): Deleted.
(JSC::MachineThreads::Thread::captureStack): Deleted.
(JSC::MachineThreads::Thread::platformThread): Deleted.
(JSC::MachineThreads::Thread::stackBase): Deleted.
(JSC::MachineThreads::Thread::stackEnd): Deleted.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
(JSC::ICStats::~ICStats):
* jit/ICStats.h:
* jsc.cpp:
(functionDollarAgentStart):
(startTimeoutThreadIfNeeded):
* runtime/JSLock.cpp:
(JSC::JSLock::lock):
* runtime/JSLock.h:
(JSC::JSLock::ownerThread):
(JSC::JSLock::currentThreadIsHoldingLock):
* runtime/SamplingProfiler.cpp:
(JSC::FrameWalker::isValidFramePointer):
(JSC::SamplingProfiler::SamplingProfiler):
(JSC::SamplingProfiler::createThreadIfNecessary):
(JSC::SamplingProfiler::takeSample):
* runtime/SamplingProfiler.h:
* runtime/VM.h:
(JSC::VM::ownerThread):
* runtime/VMTraps.cpp:
(JSC::findActiveVMAndStackBounds):
(JSC::VMTraps::SignalSender::send):
(JSC::VMTraps::fireTrap):
Source/WebCore:
Mechanical change. Use Thread:: APIs.
* Modules/indexeddb/server/IDBServer.cpp:
(WebCore::IDBServer::IDBServer::IDBServer):
* Modules/indexeddb/server/IDBServer.h:
* Modules/webaudio/AsyncAudioDecoder.cpp:
(WebCore::AsyncAudioDecoder::AsyncAudioDecoder):
(WebCore::AsyncAudioDecoder::~AsyncAudioDecoder):
(WebCore::AsyncAudioDecoder::runLoop):
* Modules/webaudio/AsyncAudioDecoder.h:
* Modules/webaudio/OfflineAudioDestinationNode.cpp:
(WebCore::OfflineAudioDestinationNode::OfflineAudioDestinationNode):
(WebCore::OfflineAudioDestinationNode::uninitialize):
(WebCore::OfflineAudioDestinationNode::startRendering):
* Modules/webaudio/OfflineAudioDestinationNode.h:
* Modules/webdatabase/Database.cpp:
(WebCore::Database::securityOrigin):
* Modules/webdatabase/DatabaseThread.cpp:
(WebCore::DatabaseThread::start):
(WebCore::DatabaseThread::databaseThread):
(WebCore::DatabaseThread::recordDatabaseOpen):
(WebCore::DatabaseThread::recordDatabaseClosed):
* Modules/webdatabase/DatabaseThread.h:
(WebCore::DatabaseThread::getThreadID):
* bindings/js/GCController.cpp:
(WebCore::GCController::garbageCollectOnAlternateThreadForDebugging):
* fileapi/AsyncFileStream.cpp:
(WebCore::callOnFileThread):
* loader/icon/IconDatabase.cpp:
(WebCore::IconDatabase::open):
(WebCore::IconDatabase::close):
* loader/icon/IconDatabase.h:
* page/ResourceUsageThread.cpp:
(WebCore::ResourceUsageThread::createThreadIfNeeded):
* page/ResourceUsageThread.h:
* page/scrolling/ScrollingThread.cpp:
(WebCore::ScrollingThread::ScrollingThread):
(WebCore::ScrollingThread::isCurrentThread):
(WebCore::ScrollingThread::createThreadIfNeeded):
(WebCore::ScrollingThread::threadCallback):
* page/scrolling/ScrollingThread.h:
* platform/audio/HRTFDatabaseLoader.cpp:
(WebCore::HRTFDatabaseLoader::HRTFDatabaseLoader):
(WebCore::HRTFDatabaseLoader::loadAsynchronously):
(WebCore::HRTFDatabaseLoader::waitForLoaderThreadCompletion):
* platform/audio/HRTFDatabaseLoader.h:
* platform/audio/ReverbConvolver.cpp:
(WebCore::ReverbConvolver::ReverbConvolver):
(WebCore::ReverbConvolver::~ReverbConvolver):
* platform/audio/ReverbConvolver.h:
* platform/audio/gstreamer/AudioFileReaderGStreamer.cpp:
(WebCore::createBusFromAudioFile):
(WebCore::createBusFromInMemoryAudioFile):
* platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:
(ResourceHandleStreamingClient::ResourceHandleStreamingClient):
(ResourceHandleStreamingClient::~ResourceHandleStreamingClient):
* platform/network/cf/LoaderRunLoopCF.cpp:
(WebCore::loaderRunLoop):
* platform/network/curl/CurlDownload.cpp:
(WebCore::CurlDownloadManager::startThreadIfNeeded):
(WebCore::CurlDownloadManager::stopThread):
* platform/network/curl/CurlDownload.h:
* platform/network/curl/SocketStreamHandleImpl.h:
* platform/network/curl/SocketStreamHandleImplCurl.cpp:
(WebCore::SocketStreamHandleImpl::startThread):
(WebCore::SocketStreamHandleImpl::stopThread):
* workers/WorkerThread.cpp:
(WebCore::WorkerThread::WorkerThread):
(WebCore::WorkerThread::start):
(WebCore::WorkerThread::workerThread):
* workers/WorkerThread.h:
(WebCore::WorkerThread::threadID):
Source/WebKit:
Mechanical change. Use Thread:: APIs.
* Storage/StorageThread.cpp:
(WebCore::StorageThread::StorageThread):
(WebCore::StorageThread::~StorageThread):
(WebCore::StorageThread::start):
(WebCore::StorageThread::dispatch):
(WebCore::StorageThread::terminate):
* Storage/StorageThread.h:
Source/WebKit2:
Mechanical change. Use Thread:: APIs.
* NetworkProcess/NetworkProcess.cpp:
(WebKit::NetworkProcess::initializeNetworkProcess):
* NetworkProcess/cache/NetworkCacheIOChannelSoup.cpp:
(WebKit::NetworkCache::IOChannel::readSyncInThread):
* Platform/IPC/Connection.cpp:
(IPC::Connection::processIncomingMessage):
* Shared/EntryPointUtilities/mac/XPCService/XPCServiceEntryPoint.h:
(WebKit::XPCServiceInitializer):
* UIProcess/linux/MemoryPressureMonitor.cpp:
(WebKit::MemoryPressureMonitor::MemoryPressureMonitor):
* WebProcess/WebProcess.cpp:
(WebKit::WebProcess::initializeWebProcess):
Source/WTF:
This patch is refactoring of WTF Threading mechanism to merge JavaScriptCore's threading extension to WTF Threading.
Previously, JavaScriptCore requires richer threading features (such as suspending and resuming threads), and they
are implemented for PlatformThread in JavaScriptCore. But these features should be implemented in WTF Threading side
instead of maintaining JSC's own threading features too. This patch removes these features from JSC and move it to
WTF Threading.
However, current WTF Threading has one problem: Windows version of WTF Threading has different semantics from Pthreads
one. In Windows WTF Threading, we cannot perform any operation after the target thread is detached: WTF Threading stop
tracking the state of the thread once the thread is detached. But this is not the same to Pthreads one. In Pthreads,
pthread_detach just means that the resource of the thread will be destroyed automatically. While some operations like
pthread_join will be rejected, some operations like pthread_kill will be accepted.
The problem is that detached thread can be suspended and resumed in JSC. For example, in jsc.cpp, we start the worker
thread and detach it immediately. In worker thread, we will create VM and thus concurrent GC will suspend and resume
the detached thread. However, in Windows WTF Threading, we have no reference to the detached thread. Thus we cannot
perform suspend and resume operations onto the detached thread.
To solve the problem, we change Windows Threading mechanism drastically to align it to the Pthread semantics. In the
new Threading, we have RefPtr<Thread> class. It holds a handle to a platform thread. We can perform threading operations
with this class. For example, Thread::suspend is offered. And we use destructor of the thread local variable to release
the resources held by RefPtr<Thread>. In Windows, Thread::detach does nothing because the resource will be destroyed
automatically by RefPtr<Thread>.
To do so, we introduce ThreadHolder for Windows. This is similar to the previous ThreadIdentifierData for Pthreads.
It holds RefPtr<Thread> in the thread local storage (technically, it is Fiber Local Storage in Windows). Thread::current()
will return this reference.
The problematic situation is that the order of the deallocation of the thread local storage is not defined. So we should
not touch thread local storage in the destructor of the thread local storage. To avoid such edge cases, we have
currentThread() / Thread::currentID() APIs. They are safe to be called even in the destructor of the other thread local
storage. And in Windows, in the FLS destructor, we will create the thread_local variable to defer the destruction of
the ThreadHolder. We ensure that this destructor is called after the other FLS destructors are called in Windows 10.
This patch is performance neutral.
* WTF.xcodeproj/project.pbxproj:
* benchmarks/ConditionSpeedTest.cpp:
* benchmarks/LockFairnessTest.cpp:
* benchmarks/LockSpeedTest.cpp:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/MainThread.h:
* wtf/MemoryPressureHandler.h:
* wtf/ParallelJobsGeneric.cpp:
(WTF::ParallelEnvironment::ThreadPrivate::tryLockFor):
(WTF::ParallelEnvironment::ThreadPrivate::workerThread):
* wtf/ParallelJobsGeneric.h:
(WTF::ParallelEnvironment::ThreadPrivate::ThreadPrivate): Deleted.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::forEachImpl):
* wtf/ParkingLot.h:
(WTF::ParkingLot::forEach):
* wtf/PlatformRegisters.h: Renamed from Source/JavaScriptCore/runtime/PlatformThread.h.
* wtf/RefPtr.h:
(WTF::RefPtr::RefPtr):
* wtf/ThreadFunctionInvocation.h:
(WTF::ThreadFunctionInvocation::ThreadFunctionInvocation):
* wtf/ThreadHolder.cpp: Added.
(WTF::ThreadHolder::~ThreadHolder):
(WTF::ThreadHolder::initialize):
* wtf/ThreadHolder.h: Renamed from Source/WTF/wtf/ThreadIdentifierDataPthreads.h.
(WTF::ThreadHolder::thread):
(WTF::ThreadHolder::ThreadHolder):
* wtf/ThreadHolderPthreads.cpp: Renamed from Source/WTF/wtf/ThreadIdentifierDataPthreads.cpp.
(WTF::ThreadHolder::initializeOnce):
(WTF::ThreadHolder::current):
(WTF::ThreadHolder::destruct):
* wtf/ThreadHolderWin.cpp: Added.
(WTF::threadMapMutex):
(WTF::threadMap):
(WTF::ThreadHolder::initializeOnce):
(WTF::ThreadHolder::current):
(WTF::ThreadHolder::destruct):
* wtf/ThreadSpecific.h:
* wtf/Threading.cpp:
(WTF::Thread::normalizeThreadName):
(WTF::threadEntryPoint):
(WTF::Thread::create):
(WTF::Thread::setCurrentThreadIsUserInteractive):
(WTF::Thread::setCurrentThreadIsUserInitiated):
(WTF::Thread::setGlobalMaxQOSClass):
(WTF::Thread::adjustedQOSClass):
(WTF::Thread::dump):
(WTF::initializeThreading):
(WTF::normalizeThreadName): Deleted.
(WTF::createThread): Deleted.
(WTF::setCurrentThreadIsUserInteractive): Deleted.
(WTF::setCurrentThreadIsUserInitiated): Deleted.
(WTF::setGlobalMaxQOSClass): Deleted.
(WTF::adjustedQOSClass): Deleted.
* wtf/Threading.h:
(WTF::Thread::id):
(WTF::Thread::operator==):
(WTF::Thread::operator!=):
(WTF::Thread::joinableState):
(WTF::Thread::didBecomeDetached):
(WTF::Thread::didJoin):
(WTF::Thread::hasExited):
(WTF::currentThread):
* wtf/ThreadingPthreads.cpp:
(WTF::Thread::Thread):
(WTF::Thread::~Thread):
(WTF::Thread::signalHandlerSuspendResume):
(WTF::Thread::initializePlatformThreading):
(WTF::initializeCurrentThreadEvenIfNonWTFCreated):
(WTF::wtfThreadEntryPoint):
(WTF::Thread::createInternal):
(WTF::Thread::initializeCurrentThreadInternal):
(WTF::Thread::changePriority):
(WTF::Thread::waitForCompletion):
(WTF::Thread::detach):
(WTF::Thread::current):
(WTF::Thread::currentID):
(WTF::Thread::signal):
(WTF::Thread::resume):
(WTF::Thread::getRegisters):
(WTF::Thread::didExit):
(WTF::Thread::establish):
(WTF::PthreadState::PthreadState): Deleted.
(WTF::PthreadState::joinableState): Deleted.
(WTF::PthreadState::pthreadHandle): Deleted.
(WTF::PthreadState::didBecomeDetached): Deleted.
(WTF::PthreadState::didExit): Deleted.
(WTF::PthreadState::didJoin): Deleted.
(WTF::PthreadState::hasExited): Deleted.
(WTF::threadMapMutex): Deleted.
(WTF::initializeThreading): Deleted.
(WTF::threadMap): Deleted.
(WTF::identifierByPthreadHandle): Deleted.
(WTF::establishIdentifierForPthreadHandle): Deleted.
(WTF::pthreadHandleForIdentifierWithLockAlreadyHeld): Deleted.
(WTF::createThreadInternal): Deleted.
(WTF::initializeCurrentThreadInternal): Deleted.
(WTF::changeThreadPriority): Deleted.
(WTF::waitForThreadCompletion): Deleted.
(WTF::detachThread): Deleted.
(WTF::threadDidExit): Deleted.
(WTF::currentThread): Deleted.
(WTF::signalThread): Deleted.
* wtf/ThreadingWin.cpp:
(WTF::Thread::Thread):
(WTF::Thread::~Thread):
(WTF::Thread::initializeCurrentThreadInternal):
(WTF::Thread::initializePlatformThreading):
(WTF::wtfThreadEntryPoint):
(WTF::Thread::createInternal):
(WTF::Thread::changePriority):
(WTF::Thread::waitForCompletion):
(WTF::Thread::detach):
(WTF::Thread::resume):
(WTF::Thread::getRegisters):
(WTF::Thread::current):
(WTF::Thread::currentID):
(WTF::Thread::didExit):
(WTF::Thread::establish):
(WTF::initializeCurrentThreadInternal): Deleted.
(WTF::threadMapMutex): Deleted.
(WTF::initializeThreading): Deleted.
(WTF::threadMap): Deleted.
(WTF::storeThreadHandleByIdentifier): Deleted.
(WTF::threadHandleForIdentifier): Deleted.
(WTF::clearThreadHandleForIdentifier): Deleted.
(WTF::createThreadInternal): Deleted.
(WTF::changeThreadPriority): Deleted.
(WTF::waitForThreadCompletion): Deleted.
(WTF::detachThread): Deleted.
(WTF::currentThread): Deleted.
* wtf/WorkQueue.cpp:
(WTF::WorkQueue::concurrentApply):
* wtf/WorkQueue.h:
* wtf/cocoa/WorkQueueCocoa.cpp:
(WTF::dispatchQOSClass):
* wtf/generic/WorkQueueGeneric.cpp:
(WorkQueue::platformInitialize):
(WorkQueue::platformInvalidate):
* wtf/linux/MemoryPressureHandlerLinux.cpp:
(WTF::MemoryPressureHandler::EventFDPoller::EventFDPoller):
(WTF::MemoryPressureHandler::EventFDPoller::~EventFDPoller):
* wtf/win/MainThreadWin.cpp:
(WTF::initializeMainThreadPlatform):
Tools:
Mechanical change. Use Thread:: APIs.
* DumpRenderTree/JavaScriptThreading.cpp:
(runJavaScriptThread):
(startJavaScriptThreads):
(stopJavaScriptThreads):
* DumpRenderTree/mac/DumpRenderTree.mm:
(testThreadIdentifierMap):
* TestWebKitAPI/Tests/WTF/Condition.cpp:
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::runLockTest):
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/187690@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@215265 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-04-12 12:08:29 +00:00
Ref < Thread > thread ;
2015-08-11 19:51:35 +00:00
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
Mutex parkingLock ;
ThreadCondition parkingCondition ;
2015-08-11 19:51:35 +00:00
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
const void * address { nullptr } ;
2015-08-11 19:51:35 +00:00
ThreadData * nextInQueue { nullptr } ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
intptr_t token { 0 } ;
2015-08-11 19:51:35 +00:00
} ;
enum class DequeueResult {
Ignore ,
RemoveAndContinue ,
RemoveAndStop
} ;
struct Bucket {
WTF_MAKE_FAST_ALLOCATED ;
public :
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
Bucket ( )
: random ( static_cast < unsigned > ( bitwise_cast < intptr_t > ( this ) ) ) // Cannot use default seed since that recurses into Lock.
{
}
2015-08-11 19:51:35 +00:00
void enqueue ( ThreadData * data )
{
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : enqueueing " , RawPointer ( data ) , " with address = " , RawPointer ( data - > address ) , " onto " , RawPointer ( this ) , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
ASSERT ( data - > address ) ;
ASSERT ( ! data - > nextInQueue ) ;
if ( queueTail ) {
queueTail - > nextInQueue = data ;
queueTail = data ;
return ;
}
queueHead = data ;
queueTail = data ;
}
template < typename Functor >
void genericDequeue ( const Functor & functor )
{
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : dequeueing from bucket at " , RawPointer ( this ) , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
if ( ! queueHead ) {
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : empty. \n " ) ) ;
2015-08-11 19:51:35 +00:00
return ;
}
// This loop is a very clever abomination. The induction variables are the pointer to the
// pointer to the current node, and the pointer to the previous node. This gives us everything
// we need to both proceed forward to the next node, and to remove nodes while maintaining the
// queueHead/queueTail and all of the nextInQueue links. For example, when we are at the head
// element, then removal means rewiring queueHead, and if it was also equal to queueTail, then
// we'd want queueTail to be set to nullptr. This works because:
//
// currentPtr == &queueHead
// previous == nullptr
//
// We remove by setting *currentPtr = (*currentPtr)->nextInQueue, i.e. changing the pointer
// that used to point to this node to instead point to this node's successor. Another example:
// if we were at the second node in the queue, then we'd have:
//
// currentPtr == &queueHead->nextInQueue
// previous == queueHead
//
// If this node is not equal to queueTail, then removing it simply means making
// queueHead->nextInQueue point to queueHead->nextInQueue->nextInQueue (which the algorithm
// achieves by mutating *currentPtr). If this node is equal to queueTail, then we want to set
// queueTail to previous, which in this case is queueHead - thus making the queue look like a
// proper one-element queue with queueHead == queueTail.
bool shouldContinue = true ;
ThreadData * * currentPtr = & queueHead ;
ThreadData * previous = nullptr ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
2018-03-02 17:13:32 +00:00
MonotonicTime time = MonotonicTime : : now ( ) ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
bool timeToBeFair = false ;
if ( time > nextFairTime )
timeToBeFair = true ;
bool didDequeue = false ;
2015-08-11 19:51:35 +00:00
while ( shouldContinue ) {
ThreadData * current = * currentPtr ;
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : got thread " , RawPointer ( current ) , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
if ( ! current )
break ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
DequeueResult result = functor ( current , timeToBeFair ) ;
2015-08-11 19:51:35 +00:00
switch ( result ) {
case DequeueResult : : Ignore :
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : currentPtr = " , RawPointer ( currentPtr ) , " , *currentPtr = " , RawPointer ( * currentPtr ) , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
previous = current ;
currentPtr = & ( * currentPtr ) - > nextInQueue ;
break ;
case DequeueResult : : RemoveAndStop :
2016-07-20 20:00:04 +00:00
shouldContinue = false ;
FALLTHROUGH ;
case DequeueResult : : RemoveAndContinue :
2015-08-11 19:51:35 +00:00
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : dequeueing " , RawPointer ( current ) , " from " , RawPointer ( this ) , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
if ( current = = queueTail )
queueTail = previous ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
didDequeue = true ;
2015-08-11 19:51:35 +00:00
* currentPtr = current - > nextInQueue ;
current - > nextInQueue = nullptr ;
break ;
}
}
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
if ( timeToBeFair & & didDequeue )
2018-03-02 17:13:32 +00:00
nextFairTime = time + Seconds : : fromMilliseconds ( random . get ( ) ) ;
2015-08-11 19:51:35 +00:00
ASSERT ( ! ! queueHead = = ! ! queueTail ) ;
}
ThreadData * dequeue ( )
{
ThreadData * result = nullptr ;
genericDequeue (
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
[ & ] ( ThreadData * element , bool ) - > DequeueResult {
2015-08-11 19:51:35 +00:00
result = element ;
return DequeueResult : : RemoveAndStop ;
} ) ;
return result ;
}
ThreadData * queueHead { nullptr } ;
ThreadData * queueTail { nullptr } ;
// This lock protects the entire bucket. Thou shall not make changes to Bucket without holding
// this lock.
Always use a byte-sized lock implementation
https://bugs.webkit.org/show_bug.cgi?id=147908
Reviewed by Geoffrey Garen.
Source/JavaScriptCore:
* runtime/ConcurrentJITLock.h: Lock is now byte-sized and ByteLock is gone, so use Lock.
Source/WTF:
At the start of my locking algorithm crusade, I implemented Lock, which is a sizeof(void*)
lock implementation with some nice theoretical properties and good performance. Then I added
the ParkingLot abstraction and ByteLock. ParkingLot uses Lock in its implementation.
ByteLock uses ParkingLot to create a sizeof(char) lock implementation that performs like
Lock.
It turns out that ByteLock is always at least as good as Lock, and sometimes a lot better:
it requires 8x less memory on 64-bit systems. It's hard to construct a benchmark where
ByteLock is significantly slower than Lock, and when you do construct such a benchmark,
tweaking it a bit can also create a scenario where ByteLock is significantly faster than
Lock.
So, the thing that we call "Lock" should really use ByteLock's algorithm, since it is more
compact and just as fast. That's what this patch does.
But we still need to keep the old Lock algorithm, because it's used to implement ParkingLot,
which in turn is used to implement ByteLock. So this patch does this transformation:
- Move the algorithm in Lock into files called WordLock.h|cpp. Make ParkingLot use
WordLock.
- Move the algorithm in ByteLock into Lock.h|cpp. Make everyone who used ByteLock use Lock
instead. All other users of Lock now get the byte-sized lock implementation.
- Remove the old ByteLock files.
* WTF.vcxproj/WTF.vcxproj:
* WTF.xcodeproj/project.pbxproj:
* benchmarks/LockSpeedTest.cpp:
(main):
* wtf/WordLock.cpp: Added.
(WTF::WordLock::lockSlow):
(WTF::WordLock::unlockSlow):
* wtf/WordLock.h: Added.
(WTF::WordLock::WordLock):
(WTF::WordLock::lock):
(WTF::WordLock::unlock):
(WTF::WordLock::isHeld):
(WTF::WordLock::isLocked):
* wtf/ByteLock.cpp: Removed.
* wtf/ByteLock.h: Removed.
* wtf/CMakeLists.txt:
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::lock):
(WTF::LockBase::unlock):
(WTF::LockBase::isHeld):
(WTF::LockBase::isLocked):
(WTF::Lock::Lock):
* wtf/ParkingLot.cpp:
Tools:
All previous tests of Lock are now tests of WordLock. All previous tests of ByteLock are
now tests of Lock.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166025@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188323 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-12 04:20:24 +00:00
WordLock lock ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
2018-03-02 17:13:32 +00:00
MonotonicTime nextFairTime ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
WeakRandom random ;
2015-08-11 19:51:35 +00:00
// Put some distane between buckets in memory. This is one of several mitigations against false
// sharing.
char padding [ 64 ] ;
} ;
Silence leaks in ParkingLot
https://bugs.webkit.org/show_bug.cgi?id=155510
Reviewed by Alexey Proskuryakov.
ParkingLot has a concurrent hashtable that it reallocates on demand. It will not reallocate
it in steady state. The hashtable is sized to accommodate the high watermark of the number
of active threads - so long as the program doesn't just keep starting an unbounded number
of threads that are all active, the hashtable will stop resizing. Each resize operation is
designed to stay out of the way of the data-access-parallel normal path, in which two
threads operating on different lock addresses don't have to synchronize. To do this, it
simply drops the old hashtable without deleting it, so that threads that were still using
it don't crash. They will realize that they have the wrong hashtable before doing anything
bad, but we don't have a way of proving when all of those threads are no longer going to
read from the old hashtables. So, we just leak them.
This is a bounded leak, since the hashtable resizes exponentially. Thus the total memory
utilization of all hashtables, including the leaked ones, converges to a linear function of
the current hashtable's size (it's 2 * size of current hashtable).
But this leak is a problem for leaks tools, which will always report this leak. This is not
useful. It's better to silence the leak. That's what this patch does by ensuring that all
hashtables, including leaked ones, end up in a global vector. This is perf-neutral.
This requires making a StaticWordLock variant of WordLock. That's probably the biggest part
of this change.
* wtf/ParkingLot.cpp:
* wtf/WordLock.cpp:
(WTF::WordLockBase::lockSlow):
(WTF::WordLockBase::unlockSlow):
(WTF::WordLock::lockSlow): Deleted.
(WTF::WordLock::unlockSlow): Deleted.
* wtf/WordLock.h:
(WTF::WordLockBase::lock):
(WTF::WordLockBase::isLocked):
(WTF::WordLock::WordLock):
(WTF::WordLock::lock): Deleted.
(WTF::WordLock::isLocked): Deleted.
Canonical link: https://commits.webkit.org/173707@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@198345 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-03-17 18:54:15 +00:00
struct Hashtable ;
// We track all allocated hashtables so that hashtable resizing doesn't anger leak detectors.
Vector < Hashtable * > * hashtables ;
2018-04-05 17:22:21 +00:00
WordLock hashtablesLock ;
Silence leaks in ParkingLot
https://bugs.webkit.org/show_bug.cgi?id=155510
Reviewed by Alexey Proskuryakov.
ParkingLot has a concurrent hashtable that it reallocates on demand. It will not reallocate
it in steady state. The hashtable is sized to accommodate the high watermark of the number
of active threads - so long as the program doesn't just keep starting an unbounded number
of threads that are all active, the hashtable will stop resizing. Each resize operation is
designed to stay out of the way of the data-access-parallel normal path, in which two
threads operating on different lock addresses don't have to synchronize. To do this, it
simply drops the old hashtable without deleting it, so that threads that were still using
it don't crash. They will realize that they have the wrong hashtable before doing anything
bad, but we don't have a way of proving when all of those threads are no longer going to
read from the old hashtables. So, we just leak them.
This is a bounded leak, since the hashtable resizes exponentially. Thus the total memory
utilization of all hashtables, including the leaked ones, converges to a linear function of
the current hashtable's size (it's 2 * size of current hashtable).
But this leak is a problem for leaks tools, which will always report this leak. This is not
useful. It's better to silence the leak. That's what this patch does by ensuring that all
hashtables, including leaked ones, end up in a global vector. This is perf-neutral.
This requires making a StaticWordLock variant of WordLock. That's probably the biggest part
of this change.
* wtf/ParkingLot.cpp:
* wtf/WordLock.cpp:
(WTF::WordLockBase::lockSlow):
(WTF::WordLockBase::unlockSlow):
(WTF::WordLock::lockSlow): Deleted.
(WTF::WordLock::unlockSlow): Deleted.
* wtf/WordLock.h:
(WTF::WordLockBase::lock):
(WTF::WordLockBase::isLocked):
(WTF::WordLock::WordLock):
(WTF::WordLock::lock): Deleted.
(WTF::WordLock::isLocked): Deleted.
Canonical link: https://commits.webkit.org/173707@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@198345 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-03-17 18:54:15 +00:00
2015-08-11 19:51:35 +00:00
struct Hashtable {
unsigned size ;
Atomic < Bucket * > data [ 1 ] ;
static Hashtable * create ( unsigned size )
{
ASSERT ( size > = 1 ) ;
Hashtable * result = static_cast < Hashtable * > (
fastZeroedMalloc ( sizeof ( Hashtable ) + sizeof ( Atomic < Bucket * > ) * ( size - 1 ) ) ) ;
result - > size = size ;
Silence leaks in ParkingLot
https://bugs.webkit.org/show_bug.cgi?id=155510
Reviewed by Alexey Proskuryakov.
ParkingLot has a concurrent hashtable that it reallocates on demand. It will not reallocate
it in steady state. The hashtable is sized to accommodate the high watermark of the number
of active threads - so long as the program doesn't just keep starting an unbounded number
of threads that are all active, the hashtable will stop resizing. Each resize operation is
designed to stay out of the way of the data-access-parallel normal path, in which two
threads operating on different lock addresses don't have to synchronize. To do this, it
simply drops the old hashtable without deleting it, so that threads that were still using
it don't crash. They will realize that they have the wrong hashtable before doing anything
bad, but we don't have a way of proving when all of those threads are no longer going to
read from the old hashtables. So, we just leak them.
This is a bounded leak, since the hashtable resizes exponentially. Thus the total memory
utilization of all hashtables, including the leaked ones, converges to a linear function of
the current hashtable's size (it's 2 * size of current hashtable).
But this leak is a problem for leaks tools, which will always report this leak. This is not
useful. It's better to silence the leak. That's what this patch does by ensuring that all
hashtables, including leaked ones, end up in a global vector. This is perf-neutral.
This requires making a StaticWordLock variant of WordLock. That's probably the biggest part
of this change.
* wtf/ParkingLot.cpp:
* wtf/WordLock.cpp:
(WTF::WordLockBase::lockSlow):
(WTF::WordLockBase::unlockSlow):
(WTF::WordLock::lockSlow): Deleted.
(WTF::WordLock::unlockSlow): Deleted.
* wtf/WordLock.h:
(WTF::WordLockBase::lock):
(WTF::WordLockBase::isLocked):
(WTF::WordLock::WordLock):
(WTF::WordLock::lock): Deleted.
(WTF::WordLock::isLocked): Deleted.
Canonical link: https://commits.webkit.org/173707@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@198345 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-03-17 18:54:15 +00:00
{
// This is not fast and it's not data-access parallel, but that's fine, because
// hashtable resizing is guaranteed to be rare and it will never happen in steady
// state.
WordLockHolder locker ( hashtablesLock ) ;
if ( ! hashtables )
hashtables = new Vector < Hashtable * > ( ) ;
hashtables - > append ( result ) ;
}
2015-08-11 19:51:35 +00:00
return result ;
}
static void destroy ( Hashtable * hashtable )
{
Silence leaks in ParkingLot
https://bugs.webkit.org/show_bug.cgi?id=155510
Reviewed by Alexey Proskuryakov.
ParkingLot has a concurrent hashtable that it reallocates on demand. It will not reallocate
it in steady state. The hashtable is sized to accommodate the high watermark of the number
of active threads - so long as the program doesn't just keep starting an unbounded number
of threads that are all active, the hashtable will stop resizing. Each resize operation is
designed to stay out of the way of the data-access-parallel normal path, in which two
threads operating on different lock addresses don't have to synchronize. To do this, it
simply drops the old hashtable without deleting it, so that threads that were still using
it don't crash. They will realize that they have the wrong hashtable before doing anything
bad, but we don't have a way of proving when all of those threads are no longer going to
read from the old hashtables. So, we just leak them.
This is a bounded leak, since the hashtable resizes exponentially. Thus the total memory
utilization of all hashtables, including the leaked ones, converges to a linear function of
the current hashtable's size (it's 2 * size of current hashtable).
But this leak is a problem for leaks tools, which will always report this leak. This is not
useful. It's better to silence the leak. That's what this patch does by ensuring that all
hashtables, including leaked ones, end up in a global vector. This is perf-neutral.
This requires making a StaticWordLock variant of WordLock. That's probably the biggest part
of this change.
* wtf/ParkingLot.cpp:
* wtf/WordLock.cpp:
(WTF::WordLockBase::lockSlow):
(WTF::WordLockBase::unlockSlow):
(WTF::WordLock::lockSlow): Deleted.
(WTF::WordLock::unlockSlow): Deleted.
* wtf/WordLock.h:
(WTF::WordLockBase::lock):
(WTF::WordLockBase::isLocked):
(WTF::WordLock::WordLock):
(WTF::WordLock::lock): Deleted.
(WTF::WordLock::isLocked): Deleted.
Canonical link: https://commits.webkit.org/173707@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@198345 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-03-17 18:54:15 +00:00
{
// This is not fast, but that's OK. See comment in create().
WordLockHolder locker ( hashtablesLock ) ;
hashtables - > removeFirst ( hashtable ) ;
}
2015-08-11 19:51:35 +00:00
fastFree ( hashtable ) ;
}
} ;
Atomic < Hashtable * > hashtable ;
Atomic < unsigned > numThreads ;
// With 64 bytes of padding per bucket, assuming a hashtable is fully populated with buckets, the
// memory usage per thread will still be less than 1KB.
const unsigned maxLoadFactor = 3 ;
const unsigned growthFactor = 2 ;
unsigned hashAddress ( const void * address )
{
return WTF : : PtrHash < const void * > : : hash ( address ) ;
}
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
Hashtable * ensureHashtable ( )
2015-08-11 19:51:35 +00:00
{
for ( ; ; ) {
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
Hashtable * currentHashtable = hashtable . load ( ) ;
if ( currentHashtable )
return currentHashtable ;
2015-08-11 19:51:35 +00:00
if ( ! currentHashtable ) {
currentHashtable = Hashtable : : create ( maxLoadFactor ) ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
if ( hashtable . compareExchangeWeak ( nullptr , currentHashtable ) ) {
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : created initial hashtable " , RawPointer ( currentHashtable ) , " \n " ) ) ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
return currentHashtable ;
2015-08-11 19:51:35 +00:00
}
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
Hashtable : : destroy ( currentHashtable ) ;
2015-08-11 19:51:35 +00:00
}
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
}
}
// Locks the hashtable. This reloops in case of rehashing, so the current hashtable may be different
// after this returns than when you called it. Guarantees that there is a hashtable. This is pretty
// slow and not scalable, so it's only used during thread creation and for debugging/testing.
Vector < Bucket * > lockHashtable ( )
{
for ( ; ; ) {
Hashtable * currentHashtable = ensureHashtable ( ) ;
ASSERT ( currentHashtable ) ;
2015-08-11 19:51:35 +00:00
// Now find all of the buckets. This makes sure that the hashtable is full of buckets so that
// we can lock all of the buckets, not just the ones that are materialized.
Vector < Bucket * > buckets ;
for ( unsigned i = currentHashtable - > size ; i - - ; ) {
Atomic < Bucket * > & bucketPointer = currentHashtable - > data [ i ] ;
for ( ; ; ) {
Bucket * bucket = bucketPointer . load ( ) ;
if ( ! bucket ) {
bucket = new Bucket ( ) ;
if ( ! bucketPointer . compareExchangeWeak ( nullptr , bucket ) ) {
delete bucket ;
continue ;
}
}
buckets . append ( bucket ) ;
break ;
}
}
// Now lock the buckets in the right order.
std : : sort ( buckets . begin ( ) , buckets . end ( ) ) ;
for ( Bucket * bucket : buckets )
bucket - > lock . lock ( ) ;
// If the hashtable didn't change (wasn't rehashed) while we were locking it, then we own it
// now.
if ( hashtable . load ( ) = = currentHashtable )
return buckets ;
// The hashtable rehashed. Unlock everything and try again.
for ( Bucket * bucket : buckets )
bucket - > lock . unlock ( ) ;
}
}
void unlockHashtable ( const Vector < Bucket * > & buckets )
{
for ( Bucket * bucket : buckets )
bucket - > lock . unlock ( ) ;
}
// Rehash the hashtable to handle numThreads threads.
void ensureHashtableSize ( unsigned numThreads )
{
// We try to ensure that the size of the hashtable used for thread queues is always large enough
// to avoid collisions. So, since we started a new thread, we may need to increase the size of the
// hashtable. This does just that. Note that we never free the old spine, since we never lock
// around spine accesses (i.e. the "hashtable" global variable).
// First do a fast check to see if rehashing is needed.
Hashtable * oldHashtable = hashtable . load ( ) ;
if ( oldHashtable & & static_cast < double > ( oldHashtable - > size ) / static_cast < double > ( numThreads ) > = maxLoadFactor ) {
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : no need to rehash because " , oldHashtable - > size , " / " , numThreads , " >= " , maxLoadFactor , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
return ;
}
// Seems like we *might* have to rehash, so lock the hashtable and try again.
Vector < Bucket * > bucketsToUnlock = lockHashtable ( ) ;
// Check again, since the hashtable could have rehashed while we were locking it. Also,
// lockHashtable() creates an initial hashtable for us.
oldHashtable = hashtable . load ( ) ;
2019-09-30 02:21:48 +00:00
RELEASE_ASSERT ( oldHashtable ) ;
if ( static_cast < double > ( oldHashtable - > size ) / static_cast < double > ( numThreads ) > = maxLoadFactor ) {
2015-08-11 19:51:35 +00:00
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : after locking, no need to rehash because " , oldHashtable - > size , " / " , numThreads , " >= " , maxLoadFactor , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
unlockHashtable ( bucketsToUnlock ) ;
return ;
}
Vector < Bucket * > reusableBuckets = bucketsToUnlock ;
// OK, now we resize. First we gather all thread datas from the old hashtable. These thread datas
// are placed into the vector in queue order.
Vector < ThreadData * > threadDatas ;
for ( Bucket * bucket : reusableBuckets ) {
while ( ThreadData * threadData = bucket - > dequeue ( ) )
threadDatas . append ( threadData ) ;
}
unsigned newSize = numThreads * growthFactor * maxLoadFactor ;
RELEASE_ASSERT ( newSize > oldHashtable - > size ) ;
Hashtable * newHashtable = Hashtable : : create ( newSize ) ;
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : created new hashtable: " , RawPointer ( newHashtable ) , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
for ( ThreadData * threadData : threadDatas ) {
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : rehashing thread data " , RawPointer ( threadData ) , " with address = " , RawPointer ( threadData - > address ) , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
unsigned hash = hashAddress ( threadData - > address ) ;
unsigned index = hash % newHashtable - > size ;
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : index = " , index , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
Bucket * bucket = newHashtable - > data [ index ] . load ( ) ;
if ( ! bucket ) {
if ( reusableBuckets . isEmpty ( ) )
bucket = new Bucket ( ) ;
else
bucket = reusableBuckets . takeLast ( ) ;
newHashtable - > data [ index ] . store ( bucket ) ;
}
bucket - > enqueue ( threadData ) ;
}
// At this point there may be some buckets left unreused. This could easily happen if the
// number of enqueued threads right now is low but the high watermark of the number of threads
// enqueued was high. We place these buckets into the hashtable basically at random, just to
// make sure we don't leak them.
for ( unsigned i = 0 ; i < newHashtable - > size & & ! reusableBuckets . isEmpty ( ) ; + + i ) {
Atomic < Bucket * > & bucketPtr = newHashtable - > data [ i ] ;
if ( bucketPtr . load ( ) )
continue ;
bucketPtr . store ( reusableBuckets . takeLast ( ) ) ;
}
// Since we increased the size of the hashtable, we should have exhausted our preallocated
// buckets by now.
ASSERT ( reusableBuckets . isEmpty ( ) ) ;
// OK, right now the old hashtable is locked up and the new hashtable is ready to rock and
// roll. After we install the new hashtable, we can release all bucket locks.
JSC should support SharedArrayBuffer
https://bugs.webkit.org/show_bug.cgi?id=163986
Reviewed by Keith Miller.
JSTests:
This adds our own test for the various corner cases of SharedArrayBuffer. This test is meant to
check all of the things that don't require concurrency.
* stress/SharedArrayBuffer.js: Added.
(checkAtomics):
(shouldFail):
(Symbol):
(runAtomic):
Source/JavaScriptCore:
This implements https://tc39.github.io/ecmascript_sharedmem/shmem.html.
There is now a new SharedArrayBuffer type. In the JS runtime, which includes typed array
types, the SharedArrayBuffer is a drop-in replacement for ArrayBuffer, even though they are
distinct types (new SharedArrayBuffer() instanceof ArrayBuffer == false and vice versa). The
DOM will not recognize SharedArrayBuffer, or any typed array that wraps it, to ensure safety.
This matches what other browsers intend to do, see
https://github.com/tc39/ecmascript_sharedmem/issues/38. API is provided for the DOM to opt
into SharedArrayBuffer. One notable place is postMessage, which will share the
SharedArrayBuffer's underlying data storage with other workers. This creates a pool of shared
memory that the workers can use to talk to each other.
There is also an Atomics object in global scope, which exposes sequentially consistent atomic
operations: add, and, compareExchange, exchange, load, or, store, sub, and xor. Additionally
it exposes a Atomics.isLockFree utility, which takes a byte amount and returns true or false.
Also there is Atomics.wake/wait, which neatly map to ParkingLot.
Accesses to typed arrays that wrap SharedArrayBuffer are optimized by JSC the same way as
always. I believe that DFG and B3 already obey the following memory model, which I believe is
a bit weaker than Cambridge and a bit stronger than what is being proposed for
SharedArrayBuffer. To predict a program's behavior under the B3 memory model, imagine the
space of all possible programs that would result from running an optimizer that adversarially
follows B3's transformation rules. B3 transformations are correct if the newly created
program is equivalent to the old one, assuming that any opaque effect in IR (like the reads
and writes of a patchpoint/call/fence) could perform any load/store that satisfies the
B3::Effects summary. Opaque effects are a way of describing an infinite set of programs: any
program that only does the effects summarized in B3::Effects belongs to the set. For example,
this prevents motion of operations across fences since fences are summarized as opaque
effects that could read or write memory. This rule alone is not enough, because it leaves the
door open for turning an atomic operation (like a load) into a non-atomic one (like a load
followed by a store of the same value back to the same location or multiple loads). This is
not an optimization that either our compiler or the CPU would want to do. One way to think of
what exactly is forbidden is that B3 transformations that mess with memory accesses can only
reorder them or remove them. This means that for any execution of the untransformed program,
the corresponding execution of the transformed program (i.e. with the same input arguments
and the same programs filled in for the opaque effects) must have the same loads and stores,
with some removed and some reordered. This is a fairly simple mental model that B3 and DFG
already follow and it's based on existing abstractions for the infinite set of programs
inside an opaque effect (DFG's AbstractHeaps and B3's Effects).
This patch makes all atomics operations intrinsic, but the DFG doesn't know about any of them
yet. That's covered by bug 164108.
This ought to be perf-neutral, but I am still running tests to confirm this. I'm also still
writing new tests to cover all of the Atomics functionality and the behavior of SAB objects.
* API/JSTypedArray.cpp:
(JSObjectGetTypedArrayBytesPtr):
(JSObjectGetTypedArrayBuffer):
(JSObjectMakeArrayBufferWithBytesNoCopy):
* API/tests/CompareAndSwapTest.cpp:
(Bitmap::concurrentTestAndSet):
* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* dfg/DFGDesiredWatchpoints.cpp:
(JSC::DFG::ArrayBufferViewWatchpointAdaptor::add):
* heap/Heap.cpp:
(JSC::Heap::reportExtraMemoryVisited):
(JSC::Heap::reportExternalMemoryVisited):
* jsc.cpp:
(functionTransferArrayBuffer):
* runtime/ArrayBuffer.cpp:
(JSC::SharedArrayBufferContents::SharedArrayBufferContents):
(JSC::SharedArrayBufferContents::~SharedArrayBufferContents):
(JSC::ArrayBufferContents::ArrayBufferContents):
(JSC::ArrayBufferContents::operator=):
(JSC::ArrayBufferContents::~ArrayBufferContents):
(JSC::ArrayBufferContents::clear):
(JSC::ArrayBufferContents::destroy):
(JSC::ArrayBufferContents::reset):
(JSC::ArrayBufferContents::tryAllocate):
(JSC::ArrayBufferContents::makeShared):
(JSC::ArrayBufferContents::transferTo):
(JSC::ArrayBufferContents::copyTo):
(JSC::ArrayBufferContents::shareWith):
(JSC::ArrayBuffer::create):
(JSC::ArrayBuffer::createAdopted):
(JSC::ArrayBuffer::createFromBytes):
(JSC::ArrayBuffer::tryCreate):
(JSC::ArrayBuffer::createUninitialized):
(JSC::ArrayBuffer::tryCreateUninitialized):
(JSC::ArrayBuffer::createInternal):
(JSC::ArrayBuffer::ArrayBuffer):
(JSC::ArrayBuffer::slice):
(JSC::ArrayBuffer::sliceImpl):
(JSC::ArrayBuffer::makeShared):
(JSC::ArrayBuffer::setSharingMode):
(JSC::ArrayBuffer::transferTo):
(JSC::ArrayBuffer::transfer): Deleted.
* runtime/ArrayBuffer.h:
(JSC::arrayBufferSharingModeName):
(JSC::SharedArrayBufferContents::data):
(JSC::ArrayBufferContents::data):
(JSC::ArrayBufferContents::sizeInBytes):
(JSC::ArrayBufferContents::isShared):
(JSC::ArrayBuffer::sharingMode):
(JSC::ArrayBuffer::isShared):
(JSC::ArrayBuffer::gcSizeEstimateInBytes):
(JSC::arrayBufferDestructorNull): Deleted.
(JSC::arrayBufferDestructorDefault): Deleted.
(JSC::ArrayBufferContents::ArrayBufferContents): Deleted.
(JSC::ArrayBufferContents::transfer): Deleted.
(JSC::ArrayBufferContents::copyTo): Deleted.
(JSC::ArrayBuffer::create): Deleted.
(JSC::ArrayBuffer::createAdopted): Deleted.
(JSC::ArrayBuffer::createFromBytes): Deleted.
(JSC::ArrayBuffer::tryCreate): Deleted.
(JSC::ArrayBuffer::createUninitialized): Deleted.
(JSC::ArrayBuffer::tryCreateUninitialized): Deleted.
(JSC::ArrayBuffer::createInternal): Deleted.
(JSC::ArrayBuffer::ArrayBuffer): Deleted.
(JSC::ArrayBuffer::slice): Deleted.
(JSC::ArrayBuffer::sliceImpl): Deleted.
(JSC::ArrayBufferContents::tryAllocate): Deleted.
(JSC::ArrayBufferContents::~ArrayBufferContents): Deleted.
* runtime/ArrayBufferSharingMode.h: Added.
* runtime/ArrayBufferView.h:
(JSC::ArrayBufferView::possiblySharedBuffer):
(JSC::ArrayBufferView::unsharedBuffer):
(JSC::ArrayBufferView::isShared):
(JSC::ArrayBufferView::buffer): Deleted.
* runtime/AtomicsObject.cpp: Added.
(JSC::AtomicsObject::AtomicsObject):
(JSC::AtomicsObject::create):
(JSC::AtomicsObject::createStructure):
(JSC::AtomicsObject::finishCreation):
(JSC::atomicsFuncAdd):
(JSC::atomicsFuncAnd):
(JSC::atomicsFuncCompareExchange):
(JSC::atomicsFuncExchange):
(JSC::atomicsFuncIsLockFree):
(JSC::atomicsFuncLoad):
(JSC::atomicsFuncOr):
(JSC::atomicsFuncStore):
(JSC::atomicsFuncSub):
(JSC::atomicsFuncWait):
(JSC::atomicsFuncWake):
(JSC::atomicsFuncXor):
* runtime/AtomicsObject.h: Added.
* runtime/CommonIdentifiers.h:
* runtime/DataView.cpp:
(JSC::DataView::wrap):
* runtime/GenericTypedArrayViewInlines.h:
(JSC::GenericTypedArrayView<Adaptor>::subarray):
* runtime/Intrinsic.h:
* runtime/JSArrayBuffer.cpp:
(JSC::JSArrayBuffer::finishCreation):
(JSC::JSArrayBuffer::isShared):
(JSC::JSArrayBuffer::sharingMode):
* runtime/JSArrayBuffer.h:
(JSC::toPossiblySharedArrayBuffer):
(JSC::toUnsharedArrayBuffer):
(JSC::JSArrayBuffer::toWrapped):
(JSC::toArrayBuffer): Deleted.
* runtime/JSArrayBufferConstructor.cpp:
(JSC::JSArrayBufferConstructor::JSArrayBufferConstructor):
(JSC::JSArrayBufferConstructor::finishCreation):
(JSC::JSArrayBufferConstructor::create):
(JSC::constructArrayBuffer):
* runtime/JSArrayBufferConstructor.h:
(JSC::JSArrayBufferConstructor::sharingMode):
* runtime/JSArrayBufferPrototype.cpp:
(JSC::arrayBufferProtoFuncSlice):
(JSC::JSArrayBufferPrototype::JSArrayBufferPrototype):
(JSC::JSArrayBufferPrototype::finishCreation):
(JSC::JSArrayBufferPrototype::create):
* runtime/JSArrayBufferPrototype.h:
* runtime/JSArrayBufferView.cpp:
(JSC::JSArrayBufferView::finishCreation):
(JSC::JSArrayBufferView::visitChildren):
(JSC::JSArrayBufferView::unsharedBuffer):
(JSC::JSArrayBufferView::unsharedJSBuffer):
(JSC::JSArrayBufferView::possiblySharedJSBuffer):
(JSC::JSArrayBufferView::neuter):
(JSC::JSArrayBufferView::toWrapped): Deleted.
* runtime/JSArrayBufferView.h:
(JSC::JSArrayBufferView::jsBuffer): Deleted.
* runtime/JSArrayBufferViewInlines.h:
(JSC::JSArrayBufferView::isShared):
(JSC::JSArrayBufferView::possiblySharedBuffer):
(JSC::JSArrayBufferView::possiblySharedImpl):
(JSC::JSArrayBufferView::unsharedImpl):
(JSC::JSArrayBufferView::byteOffset):
(JSC::JSArrayBufferView::toWrapped):
(JSC::JSArrayBufferView::buffer): Deleted.
(JSC::JSArrayBufferView::impl): Deleted.
(JSC::JSArrayBufferView::neuter): Deleted.
* runtime/JSDataView.cpp:
(JSC::JSDataView::possiblySharedTypedImpl):
(JSC::JSDataView::unsharedTypedImpl):
(JSC::JSDataView::getTypedArrayImpl):
(JSC::JSDataView::typedImpl): Deleted.
* runtime/JSDataView.h:
(JSC::JSDataView::possiblySharedBuffer):
(JSC::JSDataView::unsharedBuffer):
(JSC::JSDataView::buffer): Deleted.
* runtime/JSDataViewPrototype.cpp:
(JSC::dataViewProtoGetterBuffer):
* runtime/JSGenericTypedArrayView.h:
(JSC::toPossiblySharedNativeTypedView):
(JSC::toUnsharedNativeTypedView):
(JSC::JSGenericTypedArrayView<Adaptor>::toWrapped):
(JSC::JSGenericTypedArrayView::typedImpl): Deleted.
(JSC::toNativeTypedView): Deleted.
* runtime/JSGenericTypedArrayViewInlines.h:
(JSC::JSGenericTypedArrayView<Adaptor>::create):
(JSC::JSGenericTypedArrayView<Adaptor>::possiblySharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::unsharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::getTypedArrayImpl):
* runtime/JSGenericTypedArrayViewPrototypeFunctions.h:
(JSC::genericTypedArrayViewProtoGetterFuncBuffer):
(JSC::genericTypedArrayViewPrivateFuncSubarrayCreate):
* runtime/JSGlobalObject.cpp:
(JSC::createAtomicsProperty):
(JSC::JSGlobalObject::init):
(JSC::JSGlobalObject::visitChildren):
* runtime/JSGlobalObject.h:
(JSC::JSGlobalObject::arrayBufferPrototype):
(JSC::JSGlobalObject::arrayBufferStructure):
* runtime/MathObject.cpp:
* runtime/RuntimeFlags.h:
* runtime/SimpleTypedArrayController.cpp:
(JSC::SimpleTypedArrayController::toJS):
* runtime/TypedArrayType.h:
(JSC::typedArrayTypeForType):
Source/WebCore:
New tests added in the LayoutTests/workers/sab directory.
This teaches WebCore that a typed array could be shared or not. By default, WebCore will
reject shared typed arrays as if they were not typed arrays. This ensures that we don't get
race conditions in code that can't handle it.
If you postMessage a SharedArrayBuffer or something that wraps it, you will send the shared
memory to the other worker.
* Modules/encryptedmedia/CDMSessionClearKey.cpp:
(WebCore::CDMSessionClearKey::cachedKeyForKeyID):
* Modules/fetch/FetchBody.cpp:
(WebCore::FetchBody::extract):
* Modules/mediastream/RTCDataChannel.cpp:
(WebCore::RTCDataChannel::send):
* Modules/webaudio/AudioBuffer.cpp:
(WebCore::AudioBuffer::getChannelData):
* Modules/websockets/WebSocket.cpp:
(WebCore::WebSocket::send):
* bindings/js/JSBlobCustom.cpp:
(WebCore::constructJSBlob):
* bindings/js/JSCryptoAlgorithmDictionary.cpp:
(WebCore::createRsaKeyGenParams):
* bindings/js/JSCryptoCustom.cpp:
(WebCore::JSCrypto::getRandomValues):
* bindings/js/JSCryptoOperationData.cpp:
(WebCore::cryptoOperationDataFromJSValue):
* bindings/js/JSDOMBinding.h:
(WebCore::toJS):
(WebCore::toPossiblySharedArrayBufferView):
(WebCore::toUnsharedArrayBufferView):
(WebCore::toPossiblySharedInt8Array):
(WebCore::toPossiblySharedInt16Array):
(WebCore::toPossiblySharedInt32Array):
(WebCore::toPossiblySharedUint8Array):
(WebCore::toPossiblySharedUint8ClampedArray):
(WebCore::toPossiblySharedUint16Array):
(WebCore::toPossiblySharedUint32Array):
(WebCore::toPossiblySharedFloat32Array):
(WebCore::toPossiblySharedFloat64Array):
(WebCore::toUnsharedInt8Array):
(WebCore::toUnsharedInt16Array):
(WebCore::toUnsharedInt32Array):
(WebCore::toUnsharedUint8Array):
(WebCore::toUnsharedUint8ClampedArray):
(WebCore::toUnsharedUint16Array):
(WebCore::toUnsharedUint32Array):
(WebCore::toUnsharedFloat32Array):
(WebCore::toUnsharedFloat64Array):
(WebCore::toArrayBufferView): Deleted.
(WebCore::toInt8Array): Deleted.
(WebCore::toInt16Array): Deleted.
(WebCore::toInt32Array): Deleted.
(WebCore::toUint8Array): Deleted.
(WebCore::toUint8ClampedArray): Deleted.
(WebCore::toUint16Array): Deleted.
(WebCore::toUint32Array): Deleted.
(WebCore::toFloat32Array): Deleted.
(WebCore::toFloat64Array): Deleted.
* bindings/js/JSDataCueCustom.cpp:
(WebCore::constructJSDataCue):
* bindings/js/JSDictionary.cpp:
(WebCore::JSDictionary::convertValue):
* bindings/js/JSFileCustom.cpp:
(WebCore::constructJSFile):
* bindings/js/JSMessagePortCustom.cpp:
(WebCore::extractTransferables):
* bindings/js/JSWebGLRenderingContextBaseCustom.cpp:
(WebCore::dataFunctionf):
(WebCore::dataFunctioni):
(WebCore::dataFunctionMatrix):
* bindings/js/JSXMLHttpRequestCustom.cpp:
(WebCore::JSXMLHttpRequest::send):
* bindings/js/SerializedScriptValue.cpp:
(WebCore::CloneSerializer::dumpArrayBufferView):
(WebCore::CloneSerializer::dumpIfTerminal):
(WebCore::CloneDeserializer::readArrayBufferView):
(WebCore::CloneDeserializer::readTerminal):
(WebCore::SerializedScriptValue::transferArrayBuffers):
* bindings/js/StructuredClone.cpp:
(WebCore::structuredCloneArrayBuffer):
(WebCore::structuredCloneArrayBufferView):
* bindings/scripts/CodeGeneratorJS.pm:
(JSValueToNative):
* css/FontFace.cpp:
(WebCore::FontFace::create):
* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::bufferData):
(WebCore::WebGL2RenderingContext::bufferSubData):
* platform/graphics/avfoundation/MediaPlayerPrivateAVFoundation.cpp:
(WebCore::MediaPlayerPrivateAVFoundation::extractKeyURIKeyIDAndCertificateFromInitData):
Source/WebKit/mac:
Support the RuntimeFlag.
* WebView/WebPreferencesPrivate.h:
Source/WebKit/win:
Support the RuntimeFlag.
* Interfaces/IWebPreferencesPrivate.idl:
Source/WebKit2:
Adds some small things we need for SharedArrayBuffer.
* UIProcess/API/C/WKPreferencesRefPrivate.h:
* UIProcess/API/Cocoa/WKPreferencesPrivate.h:
* WebProcess/InjectedBundle/InjectedBundle.cpp:
(WebKit::InjectedBundle::createWebDataFromUint8Array):
Source/WTF:
Adds some small things we need for SharedArrayBuffer.
* wtf/Atomics.h:
(WTF::Atomic::compareExchangeWeakRelaxed):
(WTF::Atomic::exchangeAdd):
(WTF::Atomic::exchangeAnd):
(WTF::Atomic::exchangeOr):
(WTF::Atomic::exchangeSub):
(WTF::Atomic::exchangeXor):
(WTF::atomicLoad):
(WTF::atomicStore):
(WTF::atomicCompareExchangeWeak):
(WTF::atomicCompareExchangeWeakRelaxed):
(WTF::atomicCompareExchangeStrong):
(WTF::atomicExchangeAdd):
(WTF::atomicExchangeAnd):
(WTF::atomicExchangeOr):
(WTF::atomicExchangeSub):
(WTF::atomicExchangeXor):
(WTF::atomicExchange):
(WTF::Atomic::exchangeAndAdd): Deleted.
(WTF::weakCompareAndSwap): Deleted.
We need to be able to do atomics operations on naked pointers. We also need to be able to do
all of the things that std::atomic does. This adds those things and renames
weakCompareAndSwap to atomicCompareExchangeWeakRelaxed so that we're using consistent
terminology.
* wtf/Bitmap.h:
(WTF::WordType>::concurrentTestAndSet): Renamed weakCompareAndSwap.
(WTF::WordType>::concurrentTestAndClear): Renamed weakCompareAndSwap.
* wtf/FastBitVector.h:
(WTF::FastBitVector::atomicSetAndCheck): Renamed weakCompareAndSwap.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
Added unparkCount(), which lets you unpark some bounded number of threads and returns the
number of threads unparked. This is just a modest extension of unparkAll(). unparkAll() now
just calls unparkCount(ptr, UINT_MAX).
Tools:
Use the right kind of typed array API.
* DumpRenderTree/TestRunner.cpp:
(setAudioResultCallback):
LayoutTests:
Adding tests. This is a work in progress.
* workers/sab: Added.
* workers/sab/simple-worker-1.js: Added.
(onmessage):
* workers/sab/simple-worker-2.js: Added.
(onmessage):
* workers/sab/simple.html: Added.
Canonical link: https://commits.webkit.org/181984@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208209 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-01 03:10:00 +00:00
bool result = hashtable . compareExchangeStrong ( oldHashtable , newHashtable ) = = oldHashtable ;
2015-08-11 19:51:35 +00:00
RELEASE_ASSERT ( result ) ;
unlockHashtable ( bucketsToUnlock ) ;
}
ThreadData : : ThreadData ( )
[WTF] Introduce Thread class and use RefPtr<Thread> and align Windows Threading implementation semantics to Pthread one
https://bugs.webkit.org/show_bug.cgi?id=170502
Reviewed by Mark Lam.
Source/JavaScriptCore:
* API/tests/CompareAndSwapTest.cpp:
(testCompareAndSwap):
* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/air/testair.cpp:
* b3/testb3.cpp:
(JSC::B3::run):
* bytecode/SuperSampler.cpp:
(JSC::initializeSuperSampler):
* dfg/DFGWorklist.cpp:
* disassembler/Disassembler.cpp:
* heap/Heap.cpp:
(JSC::Heap::lastChanceToFinalize):
(JSC::Heap::notifyIsSafeToCollect):
* heap/Heap.h:
* heap/MachineStackMarker.cpp:
(JSC::MachineThreads::~MachineThreads):
(JSC::MachineThreads::addCurrentThread):
(JSC::MachineThreads::removeThread):
(JSC::MachineThreads::removeThreadIfFound):
(JSC::MachineThreads::MachineThread::MachineThread):
(JSC::MachineThreads::MachineThread::getRegisters):
(JSC::MachineThreads::MachineThread::Registers::stackPointer):
(JSC::MachineThreads::MachineThread::Registers::framePointer):
(JSC::MachineThreads::MachineThread::Registers::instructionPointer):
(JSC::MachineThreads::MachineThread::Registers::llintPC):
(JSC::MachineThreads::MachineThread::captureStack):
(JSC::MachineThreads::tryCopyOtherThreadStack):
(JSC::MachineThreads::tryCopyOtherThreadStacks):
(pthreadSignalHandlerSuspendResume): Deleted.
(JSC::threadData): Deleted.
(JSC::MachineThreads::Thread::Thread): Deleted.
(JSC::MachineThreads::Thread::createForCurrentThread): Deleted.
(JSC::MachineThreads::Thread::operator==): Deleted.
(JSC::MachineThreads::machineThreadForCurrentThread): Deleted.
(JSC::MachineThreads::ThreadData::ThreadData): Deleted.
(JSC::MachineThreads::ThreadData::~ThreadData): Deleted.
(JSC::MachineThreads::ThreadData::suspend): Deleted.
(JSC::MachineThreads::ThreadData::resume): Deleted.
(JSC::MachineThreads::ThreadData::getRegisters): Deleted.
(JSC::MachineThreads::ThreadData::Registers::stackPointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::framePointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::instructionPointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::llintPC): Deleted.
(JSC::MachineThreads::ThreadData::freeRegisters): Deleted.
(JSC::MachineThreads::ThreadData::captureStack): Deleted.
* heap/MachineStackMarker.h:
(JSC::MachineThreads::MachineThread::suspend):
(JSC::MachineThreads::MachineThread::resume):
(JSC::MachineThreads::MachineThread::threadID):
(JSC::MachineThreads::MachineThread::stackBase):
(JSC::MachineThreads::MachineThread::stackEnd):
(JSC::MachineThreads::threadsListHead):
(JSC::MachineThreads::Thread::operator!=): Deleted.
(JSC::MachineThreads::Thread::suspend): Deleted.
(JSC::MachineThreads::Thread::resume): Deleted.
(JSC::MachineThreads::Thread::getRegisters): Deleted.
(JSC::MachineThreads::Thread::freeRegisters): Deleted.
(JSC::MachineThreads::Thread::captureStack): Deleted.
(JSC::MachineThreads::Thread::platformThread): Deleted.
(JSC::MachineThreads::Thread::stackBase): Deleted.
(JSC::MachineThreads::Thread::stackEnd): Deleted.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
(JSC::ICStats::~ICStats):
* jit/ICStats.h:
* jsc.cpp:
(functionDollarAgentStart):
(startTimeoutThreadIfNeeded):
* runtime/JSLock.cpp:
(JSC::JSLock::lock):
* runtime/JSLock.h:
(JSC::JSLock::ownerThread):
(JSC::JSLock::currentThreadIsHoldingLock):
* runtime/SamplingProfiler.cpp:
(JSC::FrameWalker::isValidFramePointer):
(JSC::SamplingProfiler::SamplingProfiler):
(JSC::SamplingProfiler::createThreadIfNecessary):
(JSC::SamplingProfiler::takeSample):
* runtime/SamplingProfiler.h:
* runtime/VM.h:
(JSC::VM::ownerThread):
* runtime/VMTraps.cpp:
(JSC::findActiveVMAndStackBounds):
(JSC::VMTraps::SignalSender::send):
(JSC::VMTraps::fireTrap):
Source/WebCore:
Mechanical change. Use Thread:: APIs.
* Modules/indexeddb/server/IDBServer.cpp:
(WebCore::IDBServer::IDBServer::IDBServer):
* Modules/indexeddb/server/IDBServer.h:
* Modules/webaudio/AsyncAudioDecoder.cpp:
(WebCore::AsyncAudioDecoder::AsyncAudioDecoder):
(WebCore::AsyncAudioDecoder::~AsyncAudioDecoder):
(WebCore::AsyncAudioDecoder::runLoop):
* Modules/webaudio/AsyncAudioDecoder.h:
* Modules/webaudio/OfflineAudioDestinationNode.cpp:
(WebCore::OfflineAudioDestinationNode::OfflineAudioDestinationNode):
(WebCore::OfflineAudioDestinationNode::uninitialize):
(WebCore::OfflineAudioDestinationNode::startRendering):
* Modules/webaudio/OfflineAudioDestinationNode.h:
* Modules/webdatabase/Database.cpp:
(WebCore::Database::securityOrigin):
* Modules/webdatabase/DatabaseThread.cpp:
(WebCore::DatabaseThread::start):
(WebCore::DatabaseThread::databaseThread):
(WebCore::DatabaseThread::recordDatabaseOpen):
(WebCore::DatabaseThread::recordDatabaseClosed):
* Modules/webdatabase/DatabaseThread.h:
(WebCore::DatabaseThread::getThreadID):
* bindings/js/GCController.cpp:
(WebCore::GCController::garbageCollectOnAlternateThreadForDebugging):
* fileapi/AsyncFileStream.cpp:
(WebCore::callOnFileThread):
* loader/icon/IconDatabase.cpp:
(WebCore::IconDatabase::open):
(WebCore::IconDatabase::close):
* loader/icon/IconDatabase.h:
* page/ResourceUsageThread.cpp:
(WebCore::ResourceUsageThread::createThreadIfNeeded):
* page/ResourceUsageThread.h:
* page/scrolling/ScrollingThread.cpp:
(WebCore::ScrollingThread::ScrollingThread):
(WebCore::ScrollingThread::isCurrentThread):
(WebCore::ScrollingThread::createThreadIfNeeded):
(WebCore::ScrollingThread::threadCallback):
* page/scrolling/ScrollingThread.h:
* platform/audio/HRTFDatabaseLoader.cpp:
(WebCore::HRTFDatabaseLoader::HRTFDatabaseLoader):
(WebCore::HRTFDatabaseLoader::loadAsynchronously):
(WebCore::HRTFDatabaseLoader::waitForLoaderThreadCompletion):
* platform/audio/HRTFDatabaseLoader.h:
* platform/audio/ReverbConvolver.cpp:
(WebCore::ReverbConvolver::ReverbConvolver):
(WebCore::ReverbConvolver::~ReverbConvolver):
* platform/audio/ReverbConvolver.h:
* platform/audio/gstreamer/AudioFileReaderGStreamer.cpp:
(WebCore::createBusFromAudioFile):
(WebCore::createBusFromInMemoryAudioFile):
* platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:
(ResourceHandleStreamingClient::ResourceHandleStreamingClient):
(ResourceHandleStreamingClient::~ResourceHandleStreamingClient):
* platform/network/cf/LoaderRunLoopCF.cpp:
(WebCore::loaderRunLoop):
* platform/network/curl/CurlDownload.cpp:
(WebCore::CurlDownloadManager::startThreadIfNeeded):
(WebCore::CurlDownloadManager::stopThread):
* platform/network/curl/CurlDownload.h:
* platform/network/curl/SocketStreamHandleImpl.h:
* platform/network/curl/SocketStreamHandleImplCurl.cpp:
(WebCore::SocketStreamHandleImpl::startThread):
(WebCore::SocketStreamHandleImpl::stopThread):
* workers/WorkerThread.cpp:
(WebCore::WorkerThread::WorkerThread):
(WebCore::WorkerThread::start):
(WebCore::WorkerThread::workerThread):
* workers/WorkerThread.h:
(WebCore::WorkerThread::threadID):
Source/WebKit:
Mechanical change. Use Thread:: APIs.
* Storage/StorageThread.cpp:
(WebCore::StorageThread::StorageThread):
(WebCore::StorageThread::~StorageThread):
(WebCore::StorageThread::start):
(WebCore::StorageThread::dispatch):
(WebCore::StorageThread::terminate):
* Storage/StorageThread.h:
Source/WebKit2:
Mechanical change. Use Thread:: APIs.
* NetworkProcess/NetworkProcess.cpp:
(WebKit::NetworkProcess::initializeNetworkProcess):
* NetworkProcess/cache/NetworkCacheIOChannelSoup.cpp:
(WebKit::NetworkCache::IOChannel::readSyncInThread):
* Platform/IPC/Connection.cpp:
(IPC::Connection::processIncomingMessage):
* Shared/EntryPointUtilities/mac/XPCService/XPCServiceEntryPoint.h:
(WebKit::XPCServiceInitializer):
* UIProcess/linux/MemoryPressureMonitor.cpp:
(WebKit::MemoryPressureMonitor::MemoryPressureMonitor):
* WebProcess/WebProcess.cpp:
(WebKit::WebProcess::initializeWebProcess):
Source/WTF:
This patch is refactoring of WTF Threading mechanism to merge JavaScriptCore's threading extension to WTF Threading.
Previously, JavaScriptCore requires richer threading features (such as suspending and resuming threads), and they
are implemented for PlatformThread in JavaScriptCore. But these features should be implemented in WTF Threading side
instead of maintaining JSC's own threading features too. This patch removes these features from JSC and move it to
WTF Threading.
However, current WTF Threading has one problem: Windows version of WTF Threading has different semantics from Pthreads
one. In Windows WTF Threading, we cannot perform any operation after the target thread is detached: WTF Threading stop
tracking the state of the thread once the thread is detached. But this is not the same to Pthreads one. In Pthreads,
pthread_detach just means that the resource of the thread will be destroyed automatically. While some operations like
pthread_join will be rejected, some operations like pthread_kill will be accepted.
The problem is that detached thread can be suspended and resumed in JSC. For example, in jsc.cpp, we start the worker
thread and detach it immediately. In worker thread, we will create VM and thus concurrent GC will suspend and resume
the detached thread. However, in Windows WTF Threading, we have no reference to the detached thread. Thus we cannot
perform suspend and resume operations onto the detached thread.
To solve the problem, we change Windows Threading mechanism drastically to align it to the Pthread semantics. In the
new Threading, we have RefPtr<Thread> class. It holds a handle to a platform thread. We can perform threading operations
with this class. For example, Thread::suspend is offered. And we use destructor of the thread local variable to release
the resources held by RefPtr<Thread>. In Windows, Thread::detach does nothing because the resource will be destroyed
automatically by RefPtr<Thread>.
To do so, we introduce ThreadHolder for Windows. This is similar to the previous ThreadIdentifierData for Pthreads.
It holds RefPtr<Thread> in the thread local storage (technically, it is Fiber Local Storage in Windows). Thread::current()
will return this reference.
The problematic situation is that the order of the deallocation of the thread local storage is not defined. So we should
not touch thread local storage in the destructor of the thread local storage. To avoid such edge cases, we have
currentThread() / Thread::currentID() APIs. They are safe to be called even in the destructor of the other thread local
storage. And in Windows, in the FLS destructor, we will create the thread_local variable to defer the destruction of
the ThreadHolder. We ensure that this destructor is called after the other FLS destructors are called in Windows 10.
This patch is performance neutral.
* WTF.xcodeproj/project.pbxproj:
* benchmarks/ConditionSpeedTest.cpp:
* benchmarks/LockFairnessTest.cpp:
* benchmarks/LockSpeedTest.cpp:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/MainThread.h:
* wtf/MemoryPressureHandler.h:
* wtf/ParallelJobsGeneric.cpp:
(WTF::ParallelEnvironment::ThreadPrivate::tryLockFor):
(WTF::ParallelEnvironment::ThreadPrivate::workerThread):
* wtf/ParallelJobsGeneric.h:
(WTF::ParallelEnvironment::ThreadPrivate::ThreadPrivate): Deleted.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::forEachImpl):
* wtf/ParkingLot.h:
(WTF::ParkingLot::forEach):
* wtf/PlatformRegisters.h: Renamed from Source/JavaScriptCore/runtime/PlatformThread.h.
* wtf/RefPtr.h:
(WTF::RefPtr::RefPtr):
* wtf/ThreadFunctionInvocation.h:
(WTF::ThreadFunctionInvocation::ThreadFunctionInvocation):
* wtf/ThreadHolder.cpp: Added.
(WTF::ThreadHolder::~ThreadHolder):
(WTF::ThreadHolder::initialize):
* wtf/ThreadHolder.h: Renamed from Source/WTF/wtf/ThreadIdentifierDataPthreads.h.
(WTF::ThreadHolder::thread):
(WTF::ThreadHolder::ThreadHolder):
* wtf/ThreadHolderPthreads.cpp: Renamed from Source/WTF/wtf/ThreadIdentifierDataPthreads.cpp.
(WTF::ThreadHolder::initializeOnce):
(WTF::ThreadHolder::current):
(WTF::ThreadHolder::destruct):
* wtf/ThreadHolderWin.cpp: Added.
(WTF::threadMapMutex):
(WTF::threadMap):
(WTF::ThreadHolder::initializeOnce):
(WTF::ThreadHolder::current):
(WTF::ThreadHolder::destruct):
* wtf/ThreadSpecific.h:
* wtf/Threading.cpp:
(WTF::Thread::normalizeThreadName):
(WTF::threadEntryPoint):
(WTF::Thread::create):
(WTF::Thread::setCurrentThreadIsUserInteractive):
(WTF::Thread::setCurrentThreadIsUserInitiated):
(WTF::Thread::setGlobalMaxQOSClass):
(WTF::Thread::adjustedQOSClass):
(WTF::Thread::dump):
(WTF::initializeThreading):
(WTF::normalizeThreadName): Deleted.
(WTF::createThread): Deleted.
(WTF::setCurrentThreadIsUserInteractive): Deleted.
(WTF::setCurrentThreadIsUserInitiated): Deleted.
(WTF::setGlobalMaxQOSClass): Deleted.
(WTF::adjustedQOSClass): Deleted.
* wtf/Threading.h:
(WTF::Thread::id):
(WTF::Thread::operator==):
(WTF::Thread::operator!=):
(WTF::Thread::joinableState):
(WTF::Thread::didBecomeDetached):
(WTF::Thread::didJoin):
(WTF::Thread::hasExited):
(WTF::currentThread):
* wtf/ThreadingPthreads.cpp:
(WTF::Thread::Thread):
(WTF::Thread::~Thread):
(WTF::Thread::signalHandlerSuspendResume):
(WTF::Thread::initializePlatformThreading):
(WTF::initializeCurrentThreadEvenIfNonWTFCreated):
(WTF::wtfThreadEntryPoint):
(WTF::Thread::createInternal):
(WTF::Thread::initializeCurrentThreadInternal):
(WTF::Thread::changePriority):
(WTF::Thread::waitForCompletion):
(WTF::Thread::detach):
(WTF::Thread::current):
(WTF::Thread::currentID):
(WTF::Thread::signal):
(WTF::Thread::resume):
(WTF::Thread::getRegisters):
(WTF::Thread::didExit):
(WTF::Thread::establish):
(WTF::PthreadState::PthreadState): Deleted.
(WTF::PthreadState::joinableState): Deleted.
(WTF::PthreadState::pthreadHandle): Deleted.
(WTF::PthreadState::didBecomeDetached): Deleted.
(WTF::PthreadState::didExit): Deleted.
(WTF::PthreadState::didJoin): Deleted.
(WTF::PthreadState::hasExited): Deleted.
(WTF::threadMapMutex): Deleted.
(WTF::initializeThreading): Deleted.
(WTF::threadMap): Deleted.
(WTF::identifierByPthreadHandle): Deleted.
(WTF::establishIdentifierForPthreadHandle): Deleted.
(WTF::pthreadHandleForIdentifierWithLockAlreadyHeld): Deleted.
(WTF::createThreadInternal): Deleted.
(WTF::initializeCurrentThreadInternal): Deleted.
(WTF::changeThreadPriority): Deleted.
(WTF::waitForThreadCompletion): Deleted.
(WTF::detachThread): Deleted.
(WTF::threadDidExit): Deleted.
(WTF::currentThread): Deleted.
(WTF::signalThread): Deleted.
* wtf/ThreadingWin.cpp:
(WTF::Thread::Thread):
(WTF::Thread::~Thread):
(WTF::Thread::initializeCurrentThreadInternal):
(WTF::Thread::initializePlatformThreading):
(WTF::wtfThreadEntryPoint):
(WTF::Thread::createInternal):
(WTF::Thread::changePriority):
(WTF::Thread::waitForCompletion):
(WTF::Thread::detach):
(WTF::Thread::resume):
(WTF::Thread::getRegisters):
(WTF::Thread::current):
(WTF::Thread::currentID):
(WTF::Thread::didExit):
(WTF::Thread::establish):
(WTF::initializeCurrentThreadInternal): Deleted.
(WTF::threadMapMutex): Deleted.
(WTF::initializeThreading): Deleted.
(WTF::threadMap): Deleted.
(WTF::storeThreadHandleByIdentifier): Deleted.
(WTF::threadHandleForIdentifier): Deleted.
(WTF::clearThreadHandleForIdentifier): Deleted.
(WTF::createThreadInternal): Deleted.
(WTF::changeThreadPriority): Deleted.
(WTF::waitForThreadCompletion): Deleted.
(WTF::detachThread): Deleted.
(WTF::currentThread): Deleted.
* wtf/WorkQueue.cpp:
(WTF::WorkQueue::concurrentApply):
* wtf/WorkQueue.h:
* wtf/cocoa/WorkQueueCocoa.cpp:
(WTF::dispatchQOSClass):
* wtf/generic/WorkQueueGeneric.cpp:
(WorkQueue::platformInitialize):
(WorkQueue::platformInvalidate):
* wtf/linux/MemoryPressureHandlerLinux.cpp:
(WTF::MemoryPressureHandler::EventFDPoller::EventFDPoller):
(WTF::MemoryPressureHandler::EventFDPoller::~EventFDPoller):
* wtf/win/MainThreadWin.cpp:
(WTF::initializeMainThreadPlatform):
Tools:
Mechanical change. Use Thread:: APIs.
* DumpRenderTree/JavaScriptThreading.cpp:
(runJavaScriptThread):
(startJavaScriptThreads):
(stopJavaScriptThreads):
* DumpRenderTree/mac/DumpRenderTree.mm:
(testThreadIdentifierMap):
* TestWebKitAPI/Tests/WTF/Condition.cpp:
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::runLockTest):
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/187690@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@215265 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-04-12 12:08:29 +00:00
: thread ( Thread : : current ( ) )
2015-08-11 19:51:35 +00:00
{
unsigned currentNumThreads ;
for ( ; ; ) {
unsigned oldNumThreads = numThreads . load ( ) ;
currentNumThreads = oldNumThreads + 1 ;
if ( numThreads . compareExchangeWeak ( oldNumThreads , currentNumThreads ) )
break ;
}
ensureHashtableSize ( currentNumThreads ) ;
}
ThreadData : : ~ ThreadData ( )
{
for ( ; ; ) {
unsigned oldNumThreads = numThreads . load ( ) ;
if ( numThreads . compareExchangeWeak ( oldNumThreads , oldNumThreads - 1 ) )
break ;
}
}
ThreadData * myThreadData ( )
{
The GC should be in a thread
https://bugs.webkit.org/show_bug.cgi?id=163562
Reviewed by Geoffrey Garen and Andreas Kling.
Source/JavaScriptCore:
In a concurrent GC, the work of collecting happens on a separate thread. This patch
implements this, and schedules the thread the way that a concurrent GC thread would be
scheduled. But, the GC isn't actually concurrent yet because it calls stopTheWorld() before
doing anything and calls resumeTheWorld() after it's done with everything. The next step will
be to make it really concurrent by basically calling stopTheWorld()/resumeTheWorld() around
bounded snippets of work while making most of the work happen with the world running. Our GC
will probably always have stop-the-world phases because the semantics of JSC weak references
call for it.
This implements concurrent GC scheduling. This means that there is no longer a
Heap::collect() API. Instead, you can call collectAsync() which makes sure that a GC is
scheduled (it will do nothing if one is scheduled or ongoing) or you can call collectSync()
to schedule a GC and wait for it to happen. I made our debugging stuff call collectSync().
It should be a goal to never call collectSync() except for debugging or benchmark harness
hacks.
The collector thread is an AutomaticThread, so it won't linger when not in use. It works on
a ticket-based system, like you would see at the DMV. A ticket is a 64-bit integer. There are
two ticket counters: last granted and last served. When you request a collection, last
granted is incremented and its new value given to you. When a collection completes, last
served is incremented. collectSync() waits until last served catches up to what last granted
had been at the time you requested a GC. This means that if you request a sync GC in the
middle of an async GC, you will wait for that async GC to finish and then you will request
and wait for your sync GC.
The synchronization between the collector thread and the main threads is complex. The
collector thread needs to be able to ask the main thread to stop. It needs to be able to do
some post-GC clean-up, like the synchronous CodeBlock and LargeAllocation sweeps, on the main
thread. The collector needs to be able to ask the main thread to execute a cross-modifying
code fence before running any JIT code, since the GC might aid the JIT worklist and run JIT
finalization. It's possible for the GC to want the main thread to run something at the same
time that the main thread wants to wait for the GC. The main thread needs to be able to run
non-JSC stuff without causing the GC to completely stall. The main thread needs to be able
to query its own state (is there a request to stop?) and change it (running JSC versus not)
quickly, since this may happen on hot paths. This kind of intertwined system of requests,
notifications, and state changes requires a combination of lock-free algorithms and waiting.
So, this is all implemented using a Atomic<unsigned> Heap::m_worldState, which has bits to
represent things being requested by the collector and the heap access state of the mutator. I
am borrowing a lot of terms that I've seen in other VMs that I've worked on. Here's what they
mean:
- Stop the world: make sure that either the mutator is not running, or that it's not running
code that could mess with the heap.
- Heap access: the mutator is said to have heap access if it could mess with the heap.
If you stop the world and the mutator doesn't have heap access, all you're doing is making
sure that it will block when it tries to acquire heap access. This means that our GC is
already fully concurrent in cases where the GC is requested while the mutator has no heap
access. This probably won't happen, but if it did then it should just work. Usually, stopping
the world means that we state our shouldStop request with m_worldState, and a future call
to Heap::stopIfNecessary() will go to slow path and stop. The act of stopping or waiting to
acquire heap access is managed by using ParkingLot API directly on m_worldState. This works
out great because it would be very awkward to get the same functionality using locks and
condition variables, since we want stopIfNecessary/acquireAccess/requestAccess fast paths
that are single atomic instructions (load/CAS/CAS, respectively). The mutator will call these
things frequently. Currently we have Heap::stopIfNecessary() polling on every allocator slow
path, but we may want to make it even more frequent than that.
Currently only JSC API clients benefit from the heap access optimization. The DOM forces us
to assume that heap access is permanently on, since DOM manipulation doesn't always hold the
JSLock. We could still allow the GC to proceed when the runloop is idle by having the GC put
a task on the runloop that just calls stopIfNecessary().
This is perf neutral. The only behavior change that clients ought to observe is that marking
and the weak fixpoint happen on a separate thread. Marking was already parallel so it already
handled multiple threads, but now it _never_ runs on the main thread. The weak fixpoint
needed some help to be able to run on another thread - mostly because there was some code in
IndexedDB that was using thread specifics in the weak fixpoint.
* API/JSBase.cpp:
(JSSynchronousEdenCollectForDebugging):
* API/JSManagedValue.mm:
(-[JSManagedValue initWithValue:]):
* heap/EdenGCActivityCallback.cpp:
(JSC::EdenGCActivityCallback::doCollection):
* heap/FullGCActivityCallback.cpp:
(JSC::FullGCActivityCallback::doCollection):
* heap/Heap.cpp:
(JSC::Heap::Thread::Thread):
(JSC::Heap::Heap):
(JSC::Heap::lastChanceToFinalize):
(JSC::Heap::markRoots):
(JSC::Heap::gatherStackRoots):
(JSC::Heap::deleteUnmarkedCompiledCode):
(JSC::Heap::collectAllGarbage):
(JSC::Heap::collectAsync):
(JSC::Heap::collectSync):
(JSC::Heap::shouldCollectInThread):
(JSC::Heap::collectInThread):
(JSC::Heap::stopTheWorld):
(JSC::Heap::resumeTheWorld):
(JSC::Heap::stopIfNecessarySlow):
(JSC::Heap::acquireAccessSlow):
(JSC::Heap::releaseAccessSlow):
(JSC::Heap::handleDidJIT):
(JSC::Heap::handleNeedFinalize):
(JSC::Heap::setDidJIT):
(JSC::Heap::setNeedFinalize):
(JSC::Heap::waitWhileNeedFinalize):
(JSC::Heap::finalize):
(JSC::Heap::requestCollection):
(JSC::Heap::waitForCollection):
(JSC::Heap::didFinishCollection):
(JSC::Heap::canCollect):
(JSC::Heap::shouldCollectHeuristic):
(JSC::Heap::shouldCollect):
(JSC::Heap::collectIfNecessaryOrDefer):
(JSC::Heap::collectAccordingToDeferGCProbability):
(JSC::Heap::collect): Deleted.
(JSC::Heap::collectWithoutAnySweep): Deleted.
(JSC::Heap::collectImpl): Deleted.
* heap/Heap.h:
(JSC::Heap::ReleaseAccessScope::ReleaseAccessScope):
(JSC::Heap::ReleaseAccessScope::~ReleaseAccessScope):
* heap/HeapInlines.h:
(JSC::Heap::acquireAccess):
(JSC::Heap::releaseAccess):
(JSC::Heap::stopIfNecessary):
* heap/MachineStackMarker.cpp:
(JSC::MachineThreads::gatherConservativeRoots):
(JSC::MachineThreads::gatherFromCurrentThread): Deleted.
* heap/MachineStackMarker.h:
* jit/JITWorklist.cpp:
(JSC::JITWorklist::completeAllForVM):
* jit/JITWorklist.h:
* jsc.cpp:
(functionFullGC):
(functionEdenGC):
* runtime/InitializeThreading.cpp:
(JSC::initializeThreading):
* runtime/JSLock.cpp:
(JSC::JSLock::didAcquireLock):
(JSC::JSLock::unlock):
(JSC::JSLock::willReleaseLock):
* tools/JSDollarVMPrototype.cpp:
(JSC::JSDollarVMPrototype::edenGC):
Source/WebCore:
No new tests because existing tests cover this.
We now need to be more careful about using JSLock. This fixes some places that were not
holding it. New assertions in the GC are more likely to catch this than before.
* bindings/js/WorkerScriptController.cpp:
(WebCore::WorkerScriptController::WorkerScriptController):
Source/WTF:
This fixes some bugs and adds a few features.
* wtf/Atomics.h: The GC may do work on behalf of the JIT. If it does, the main thread needs to execute a cross-modifying code fence. This is cpuid on x86 and I believe it's isb on ARM. It would have been an isync on PPC and I think that isb is the ARM equivalent.
(WTF::arm_isb):
(WTF::crossModifyingCodeFence):
(WTF::x86_ortop):
(WTF::x86_cpuid):
* wtf/AutomaticThread.cpp: I accidentally had AutomaticThreadCondition inherit from ThreadSafeRefCounted<AutomaticThread> [sic]. This never crashed before because all of our prior AutomaticThreadConditions were immortal.
(WTF::AutomaticThread::AutomaticThread):
(WTF::AutomaticThread::~AutomaticThread):
(WTF::AutomaticThread::start):
* wtf/AutomaticThread.h:
* wtf/MainThread.cpp: Need to allow initializeGCThreads() to be called separately because it's now more than just a debugging thing.
(WTF::initializeGCThreads):
Canonical link: https://commits.webkit.org/182065@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208306 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-02 22:01:04 +00:00
static ThreadSpecific < RefPtr < ThreadData > , CanBeGCThread : : True > * threadData ;
2015-08-11 19:51:35 +00:00
static std : : once_flag initializeOnce ;
std : : call_once (
initializeOnce ,
[ ] {
The GC should be in a thread
https://bugs.webkit.org/show_bug.cgi?id=163562
Reviewed by Geoffrey Garen and Andreas Kling.
Source/JavaScriptCore:
In a concurrent GC, the work of collecting happens on a separate thread. This patch
implements this, and schedules the thread the way that a concurrent GC thread would be
scheduled. But, the GC isn't actually concurrent yet because it calls stopTheWorld() before
doing anything and calls resumeTheWorld() after it's done with everything. The next step will
be to make it really concurrent by basically calling stopTheWorld()/resumeTheWorld() around
bounded snippets of work while making most of the work happen with the world running. Our GC
will probably always have stop-the-world phases because the semantics of JSC weak references
call for it.
This implements concurrent GC scheduling. This means that there is no longer a
Heap::collect() API. Instead, you can call collectAsync() which makes sure that a GC is
scheduled (it will do nothing if one is scheduled or ongoing) or you can call collectSync()
to schedule a GC and wait for it to happen. I made our debugging stuff call collectSync().
It should be a goal to never call collectSync() except for debugging or benchmark harness
hacks.
The collector thread is an AutomaticThread, so it won't linger when not in use. It works on
a ticket-based system, like you would see at the DMV. A ticket is a 64-bit integer. There are
two ticket counters: last granted and last served. When you request a collection, last
granted is incremented and its new value given to you. When a collection completes, last
served is incremented. collectSync() waits until last served catches up to what last granted
had been at the time you requested a GC. This means that if you request a sync GC in the
middle of an async GC, you will wait for that async GC to finish and then you will request
and wait for your sync GC.
The synchronization between the collector thread and the main threads is complex. The
collector thread needs to be able to ask the main thread to stop. It needs to be able to do
some post-GC clean-up, like the synchronous CodeBlock and LargeAllocation sweeps, on the main
thread. The collector needs to be able to ask the main thread to execute a cross-modifying
code fence before running any JIT code, since the GC might aid the JIT worklist and run JIT
finalization. It's possible for the GC to want the main thread to run something at the same
time that the main thread wants to wait for the GC. The main thread needs to be able to run
non-JSC stuff without causing the GC to completely stall. The main thread needs to be able
to query its own state (is there a request to stop?) and change it (running JSC versus not)
quickly, since this may happen on hot paths. This kind of intertwined system of requests,
notifications, and state changes requires a combination of lock-free algorithms and waiting.
So, this is all implemented using a Atomic<unsigned> Heap::m_worldState, which has bits to
represent things being requested by the collector and the heap access state of the mutator. I
am borrowing a lot of terms that I've seen in other VMs that I've worked on. Here's what they
mean:
- Stop the world: make sure that either the mutator is not running, or that it's not running
code that could mess with the heap.
- Heap access: the mutator is said to have heap access if it could mess with the heap.
If you stop the world and the mutator doesn't have heap access, all you're doing is making
sure that it will block when it tries to acquire heap access. This means that our GC is
already fully concurrent in cases where the GC is requested while the mutator has no heap
access. This probably won't happen, but if it did then it should just work. Usually, stopping
the world means that we state our shouldStop request with m_worldState, and a future call
to Heap::stopIfNecessary() will go to slow path and stop. The act of stopping or waiting to
acquire heap access is managed by using ParkingLot API directly on m_worldState. This works
out great because it would be very awkward to get the same functionality using locks and
condition variables, since we want stopIfNecessary/acquireAccess/requestAccess fast paths
that are single atomic instructions (load/CAS/CAS, respectively). The mutator will call these
things frequently. Currently we have Heap::stopIfNecessary() polling on every allocator slow
path, but we may want to make it even more frequent than that.
Currently only JSC API clients benefit from the heap access optimization. The DOM forces us
to assume that heap access is permanently on, since DOM manipulation doesn't always hold the
JSLock. We could still allow the GC to proceed when the runloop is idle by having the GC put
a task on the runloop that just calls stopIfNecessary().
This is perf neutral. The only behavior change that clients ought to observe is that marking
and the weak fixpoint happen on a separate thread. Marking was already parallel so it already
handled multiple threads, but now it _never_ runs on the main thread. The weak fixpoint
needed some help to be able to run on another thread - mostly because there was some code in
IndexedDB that was using thread specifics in the weak fixpoint.
* API/JSBase.cpp:
(JSSynchronousEdenCollectForDebugging):
* API/JSManagedValue.mm:
(-[JSManagedValue initWithValue:]):
* heap/EdenGCActivityCallback.cpp:
(JSC::EdenGCActivityCallback::doCollection):
* heap/FullGCActivityCallback.cpp:
(JSC::FullGCActivityCallback::doCollection):
* heap/Heap.cpp:
(JSC::Heap::Thread::Thread):
(JSC::Heap::Heap):
(JSC::Heap::lastChanceToFinalize):
(JSC::Heap::markRoots):
(JSC::Heap::gatherStackRoots):
(JSC::Heap::deleteUnmarkedCompiledCode):
(JSC::Heap::collectAllGarbage):
(JSC::Heap::collectAsync):
(JSC::Heap::collectSync):
(JSC::Heap::shouldCollectInThread):
(JSC::Heap::collectInThread):
(JSC::Heap::stopTheWorld):
(JSC::Heap::resumeTheWorld):
(JSC::Heap::stopIfNecessarySlow):
(JSC::Heap::acquireAccessSlow):
(JSC::Heap::releaseAccessSlow):
(JSC::Heap::handleDidJIT):
(JSC::Heap::handleNeedFinalize):
(JSC::Heap::setDidJIT):
(JSC::Heap::setNeedFinalize):
(JSC::Heap::waitWhileNeedFinalize):
(JSC::Heap::finalize):
(JSC::Heap::requestCollection):
(JSC::Heap::waitForCollection):
(JSC::Heap::didFinishCollection):
(JSC::Heap::canCollect):
(JSC::Heap::shouldCollectHeuristic):
(JSC::Heap::shouldCollect):
(JSC::Heap::collectIfNecessaryOrDefer):
(JSC::Heap::collectAccordingToDeferGCProbability):
(JSC::Heap::collect): Deleted.
(JSC::Heap::collectWithoutAnySweep): Deleted.
(JSC::Heap::collectImpl): Deleted.
* heap/Heap.h:
(JSC::Heap::ReleaseAccessScope::ReleaseAccessScope):
(JSC::Heap::ReleaseAccessScope::~ReleaseAccessScope):
* heap/HeapInlines.h:
(JSC::Heap::acquireAccess):
(JSC::Heap::releaseAccess):
(JSC::Heap::stopIfNecessary):
* heap/MachineStackMarker.cpp:
(JSC::MachineThreads::gatherConservativeRoots):
(JSC::MachineThreads::gatherFromCurrentThread): Deleted.
* heap/MachineStackMarker.h:
* jit/JITWorklist.cpp:
(JSC::JITWorklist::completeAllForVM):
* jit/JITWorklist.h:
* jsc.cpp:
(functionFullGC):
(functionEdenGC):
* runtime/InitializeThreading.cpp:
(JSC::initializeThreading):
* runtime/JSLock.cpp:
(JSC::JSLock::didAcquireLock):
(JSC::JSLock::unlock):
(JSC::JSLock::willReleaseLock):
* tools/JSDollarVMPrototype.cpp:
(JSC::JSDollarVMPrototype::edenGC):
Source/WebCore:
No new tests because existing tests cover this.
We now need to be more careful about using JSLock. This fixes some places that were not
holding it. New assertions in the GC are more likely to catch this than before.
* bindings/js/WorkerScriptController.cpp:
(WebCore::WorkerScriptController::WorkerScriptController):
Source/WTF:
This fixes some bugs and adds a few features.
* wtf/Atomics.h: The GC may do work on behalf of the JIT. If it does, the main thread needs to execute a cross-modifying code fence. This is cpuid on x86 and I believe it's isb on ARM. It would have been an isync on PPC and I think that isb is the ARM equivalent.
(WTF::arm_isb):
(WTF::crossModifyingCodeFence):
(WTF::x86_ortop):
(WTF::x86_cpuid):
* wtf/AutomaticThread.cpp: I accidentally had AutomaticThreadCondition inherit from ThreadSafeRefCounted<AutomaticThread> [sic]. This never crashed before because all of our prior AutomaticThreadConditions were immortal.
(WTF::AutomaticThread::AutomaticThread):
(WTF::AutomaticThread::~AutomaticThread):
(WTF::AutomaticThread::start):
* wtf/AutomaticThread.h:
* wtf/MainThread.cpp: Need to allow initializeGCThreads() to be called separately because it's now more than just a debugging thing.
(WTF::initializeGCThreads):
Canonical link: https://commits.webkit.org/182065@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208306 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-02 22:01:04 +00:00
threadData = new ThreadSpecific < RefPtr < ThreadData > , CanBeGCThread : : True > ( ) ;
2015-08-11 19:51:35 +00:00
} ) ;
2016-09-13 15:59:25 +00:00
RefPtr < ThreadData > & result = * * threadData ;
if ( ! result )
result = adoptRef ( new ThreadData ( ) ) ;
return result . get ( ) ;
2015-08-11 19:51:35 +00:00
}
template < typename Functor >
bool enqueue ( const void * address , const Functor & functor )
{
unsigned hash = hashAddress ( address ) ;
for ( ; ; ) {
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
Hashtable * myHashtable = ensureHashtable ( ) ;
2015-08-11 19:51:35 +00:00
unsigned index = hash % myHashtable - > size ;
Atomic < Bucket * > & bucketPointer = myHashtable - > data [ index ] ;
Bucket * bucket ;
for ( ; ; ) {
bucket = bucketPointer . load ( ) ;
if ( ! bucket ) {
bucket = new Bucket ( ) ;
if ( ! bucketPointer . compareExchangeWeak ( nullptr , bucket ) ) {
delete bucket ;
continue ;
}
}
break ;
}
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : enqueueing onto bucket " , RawPointer ( bucket ) , " with index " , index , " for address " , RawPointer ( address ) , " with hash " , hash , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
bucket - > lock . lock ( ) ;
// At this point the hashtable could have rehashed under us.
if ( hashtable . load ( ) ! = myHashtable ) {
bucket - > lock . unlock ( ) ;
continue ;
}
ThreadData * threadData = functor ( ) ;
bool result ;
if ( threadData ) {
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : proceeding to enqueue " , RawPointer ( threadData ) , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
bucket - > enqueue ( threadData ) ;
result = true ;
} else
result = false ;
bucket - > lock . unlock ( ) ;
return result ;
}
}
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
enum class BucketMode {
EnsureNonEmpty ,
IgnoreEmpty
} ;
template < typename DequeueFunctor , typename FinishFunctor >
bool dequeue (
const void * address , BucketMode bucketMode , const DequeueFunctor & dequeueFunctor ,
const FinishFunctor & finishFunctor )
2015-08-11 19:51:35 +00:00
{
unsigned hash = hashAddress ( address ) ;
for ( ; ; ) {
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
Hashtable * myHashtable = ensureHashtable ( ) ;
2015-08-11 19:51:35 +00:00
unsigned index = hash % myHashtable - > size ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
Atomic < Bucket * > & bucketPointer = myHashtable - > data [ index ] ;
Bucket * bucket = bucketPointer . load ( ) ;
2015-08-11 19:51:35 +00:00
if ( ! bucket ) {
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
if ( bucketMode = = BucketMode : : IgnoreEmpty )
return false ;
for ( ; ; ) {
bucket = bucketPointer . load ( ) ;
if ( ! bucket ) {
bucket = new Bucket ( ) ;
if ( ! bucketPointer . compareExchangeWeak ( nullptr , bucket ) ) {
delete bucket ;
continue ;
}
}
break ;
}
2015-08-11 19:51:35 +00:00
}
bucket - > lock . lock ( ) ;
// At this point the hashtable could have rehashed under us.
if ( hashtable . load ( ) ! = myHashtable ) {
bucket - > lock . unlock ( ) ;
continue ;
}
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
bucket - > genericDequeue ( dequeueFunctor ) ;
2015-08-11 19:51:35 +00:00
bool result = ! ! bucket - > queueHead ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
finishFunctor ( result ) ;
2015-08-11 19:51:35 +00:00
bucket - > lock . unlock ( ) ;
return result ;
}
}
} // anonymous namespace
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
NEVER_INLINE ParkingLot : : ParkResult ParkingLot : : parkConditionallyImpl (
2015-08-13 20:42:11 +00:00
const void * address ,
2016-05-26 21:58:42 +00:00
const ScopedLambda < bool ( ) > & validation ,
const ScopedLambda < void ( ) > & beforeSleep ,
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
const TimeWithDynamicClockType & timeout )
2015-08-11 19:51:35 +00:00
{
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : parking. \n " ) ) ;
2015-08-11 19:51:35 +00:00
ThreadData * me = myThreadData ( ) ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
me - > token = 0 ;
2015-08-11 19:51:35 +00:00
2015-08-13 20:42:11 +00:00
// Guard against someone calling parkConditionally() recursively from beforeSleep().
RELEASE_ASSERT ( ! me - > address ) ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
bool enqueueResult = enqueue (
2015-08-11 19:51:35 +00:00
address ,
[ & ] ( ) - > ThreadData * {
if ( ! validation ( ) )
return nullptr ;
me - > address = address ;
return me ;
} ) ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
if ( ! enqueueResult )
return ParkResult ( ) ;
2015-08-11 19:51:35 +00:00
2015-08-13 20:42:11 +00:00
beforeSleep ( ) ;
2015-08-15 00:14:52 +00:00
2015-08-13 20:42:11 +00:00
bool didGetDequeued ;
2015-08-11 19:51:35 +00:00
{
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
MutexLocker locker ( me - > parkingLock ) ;
while ( me - > address & & timeout . nowWithSameClock ( ) < timeout ) {
me - > parkingCondition . timedWait (
2018-02-23 04:18:17 +00:00
me - > parkingLock , timeout . approximateWallTime ( ) ) ;
2015-08-15 00:14:52 +00:00
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
// It's possible for the OS to decide not to wait. If it does that then it will also
// decide not to release the lock. If there's a bug in the time math, then this could
// result in a deadlock. Flashing the lock means that at worst it's just a CPU-eating
// spin.
me - > parkingLock . unlock ( ) ;
me - > parkingLock . lock ( ) ;
2015-08-15 00:14:52 +00:00
}
2015-08-13 20:42:11 +00:00
ASSERT ( ! me - > address | | me - > address = = address ) ;
didGetDequeued = ! me - > address ;
2015-08-11 19:51:35 +00:00
}
2015-08-15 00:14:52 +00:00
2015-08-13 20:42:11 +00:00
if ( didGetDequeued ) {
// Great! We actually got dequeued rather than the timeout expiring.
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
ParkResult result ;
result . wasUnparked = true ;
result . token = me - > token ;
return result ;
2015-08-13 20:42:11 +00:00
}
// Have to remove ourselves from the queue since we timed out and nobody has dequeued us yet.
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
bool didDequeue = false ;
2015-08-13 20:42:11 +00:00
dequeue (
address , BucketMode : : IgnoreEmpty ,
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
[ & ] ( ThreadData * element , bool ) {
if ( element = = me ) {
didDequeue = true ;
2015-08-13 20:42:11 +00:00
return DequeueResult : : RemoveAndStop ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
}
2015-08-13 20:42:11 +00:00
return DequeueResult : : Ignore ;
} ,
[ ] ( bool ) { } ) ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
// If didDequeue is true, then we dequeued ourselves. This means that we were not unparked.
// If didDequeue is false, then someone unparked us.
RELEASE_ASSERT ( ! me - > nextInQueue ) ;
2015-08-13 20:42:11 +00:00
// Make sure that no matter what, me->address is null after this point.
{
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
MutexLocker locker ( me - > parkingLock ) ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
if ( ! didDequeue ) {
REGRESSION (r207480): 3 Dromaeo tests failing
https://bugs.webkit.org/show_bug.cgi?id=163633
Reviewed by Mark Lam.
It's a ParkingLot bug: if we timeout and get unparked at the same time, then the unparking
thread will clear our address eventually - but not immediately. This causes a nasty
assertion failure. The tricky thing is that when we detect this, we need to wait until that
unparking thread does the deed. Otherwise, they will still do it at some later time.
Alternatively, we could use some kind of versioning to detect this - increment the version
when you park, so that unparking threads will know if they are time travelers. That seems
more yucky.
I don't think it matters too much what we do here, so long as it's simple and correct, since
this requires a race that is rare enough that it didn't happen once in my testing, and I was
pretty thorough. For example, it didn't happen once in 15 runs of JetStream. The race is
rare because it requires the timeout to happen right as someone else is unparking. Since
it's so rare, its probably OK that the unparked thread waits just a tiny moment until the
unparking thread is done.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOneImpl):
Canonical link: https://commits.webkit.org/181485@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@207577 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-10-19 23:37:20 +00:00
// If we did not dequeue ourselves, then someone else did. They will set our address to
// null. We don't want to proceed until they do this, because otherwise, they may set
// our address to null in some distant future when we're already trying to wait for
// other things.
while ( me - > address )
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
me - > parkingCondition . wait ( me - > parkingLock ) ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
}
2015-08-13 20:42:11 +00:00
me - > address = nullptr ;
}
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
ParkResult result ;
result . wasUnparked = ! didDequeue ;
if ( ! didDequeue ) {
// If we were unparked then there should be a token.
result . token = me - > token ;
}
return result ;
2015-08-11 19:51:35 +00:00
}
2016-04-20 04:25:02 +00:00
NEVER_INLINE ParkingLot : : UnparkResult ParkingLot : : unparkOne ( const void * address )
2015-08-11 19:51:35 +00:00
{
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : unparking one. \n " ) ) ;
2016-04-20 04:25:02 +00:00
UnparkResult result ;
2015-08-11 19:51:35 +00:00
2016-09-13 15:59:25 +00:00
RefPtr < ThreadData > threadData ;
2016-04-20 04:25:02 +00:00
result . mayHaveMoreThreads = dequeue (
2015-08-11 19:51:35 +00:00
address ,
JSC should support SharedArrayBuffer
https://bugs.webkit.org/show_bug.cgi?id=163986
Reviewed by Keith Miller.
JSTests:
This adds our own test for the various corner cases of SharedArrayBuffer. This test is meant to
check all of the things that don't require concurrency.
* stress/SharedArrayBuffer.js: Added.
(checkAtomics):
(shouldFail):
(Symbol):
(runAtomic):
Source/JavaScriptCore:
This implements https://tc39.github.io/ecmascript_sharedmem/shmem.html.
There is now a new SharedArrayBuffer type. In the JS runtime, which includes typed array
types, the SharedArrayBuffer is a drop-in replacement for ArrayBuffer, even though they are
distinct types (new SharedArrayBuffer() instanceof ArrayBuffer == false and vice versa). The
DOM will not recognize SharedArrayBuffer, or any typed array that wraps it, to ensure safety.
This matches what other browsers intend to do, see
https://github.com/tc39/ecmascript_sharedmem/issues/38. API is provided for the DOM to opt
into SharedArrayBuffer. One notable place is postMessage, which will share the
SharedArrayBuffer's underlying data storage with other workers. This creates a pool of shared
memory that the workers can use to talk to each other.
There is also an Atomics object in global scope, which exposes sequentially consistent atomic
operations: add, and, compareExchange, exchange, load, or, store, sub, and xor. Additionally
it exposes a Atomics.isLockFree utility, which takes a byte amount and returns true or false.
Also there is Atomics.wake/wait, which neatly map to ParkingLot.
Accesses to typed arrays that wrap SharedArrayBuffer are optimized by JSC the same way as
always. I believe that DFG and B3 already obey the following memory model, which I believe is
a bit weaker than Cambridge and a bit stronger than what is being proposed for
SharedArrayBuffer. To predict a program's behavior under the B3 memory model, imagine the
space of all possible programs that would result from running an optimizer that adversarially
follows B3's transformation rules. B3 transformations are correct if the newly created
program is equivalent to the old one, assuming that any opaque effect in IR (like the reads
and writes of a patchpoint/call/fence) could perform any load/store that satisfies the
B3::Effects summary. Opaque effects are a way of describing an infinite set of programs: any
program that only does the effects summarized in B3::Effects belongs to the set. For example,
this prevents motion of operations across fences since fences are summarized as opaque
effects that could read or write memory. This rule alone is not enough, because it leaves the
door open for turning an atomic operation (like a load) into a non-atomic one (like a load
followed by a store of the same value back to the same location or multiple loads). This is
not an optimization that either our compiler or the CPU would want to do. One way to think of
what exactly is forbidden is that B3 transformations that mess with memory accesses can only
reorder them or remove them. This means that for any execution of the untransformed program,
the corresponding execution of the transformed program (i.e. with the same input arguments
and the same programs filled in for the opaque effects) must have the same loads and stores,
with some removed and some reordered. This is a fairly simple mental model that B3 and DFG
already follow and it's based on existing abstractions for the infinite set of programs
inside an opaque effect (DFG's AbstractHeaps and B3's Effects).
This patch makes all atomics operations intrinsic, but the DFG doesn't know about any of them
yet. That's covered by bug 164108.
This ought to be perf-neutral, but I am still running tests to confirm this. I'm also still
writing new tests to cover all of the Atomics functionality and the behavior of SAB objects.
* API/JSTypedArray.cpp:
(JSObjectGetTypedArrayBytesPtr):
(JSObjectGetTypedArrayBuffer):
(JSObjectMakeArrayBufferWithBytesNoCopy):
* API/tests/CompareAndSwapTest.cpp:
(Bitmap::concurrentTestAndSet):
* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* dfg/DFGDesiredWatchpoints.cpp:
(JSC::DFG::ArrayBufferViewWatchpointAdaptor::add):
* heap/Heap.cpp:
(JSC::Heap::reportExtraMemoryVisited):
(JSC::Heap::reportExternalMemoryVisited):
* jsc.cpp:
(functionTransferArrayBuffer):
* runtime/ArrayBuffer.cpp:
(JSC::SharedArrayBufferContents::SharedArrayBufferContents):
(JSC::SharedArrayBufferContents::~SharedArrayBufferContents):
(JSC::ArrayBufferContents::ArrayBufferContents):
(JSC::ArrayBufferContents::operator=):
(JSC::ArrayBufferContents::~ArrayBufferContents):
(JSC::ArrayBufferContents::clear):
(JSC::ArrayBufferContents::destroy):
(JSC::ArrayBufferContents::reset):
(JSC::ArrayBufferContents::tryAllocate):
(JSC::ArrayBufferContents::makeShared):
(JSC::ArrayBufferContents::transferTo):
(JSC::ArrayBufferContents::copyTo):
(JSC::ArrayBufferContents::shareWith):
(JSC::ArrayBuffer::create):
(JSC::ArrayBuffer::createAdopted):
(JSC::ArrayBuffer::createFromBytes):
(JSC::ArrayBuffer::tryCreate):
(JSC::ArrayBuffer::createUninitialized):
(JSC::ArrayBuffer::tryCreateUninitialized):
(JSC::ArrayBuffer::createInternal):
(JSC::ArrayBuffer::ArrayBuffer):
(JSC::ArrayBuffer::slice):
(JSC::ArrayBuffer::sliceImpl):
(JSC::ArrayBuffer::makeShared):
(JSC::ArrayBuffer::setSharingMode):
(JSC::ArrayBuffer::transferTo):
(JSC::ArrayBuffer::transfer): Deleted.
* runtime/ArrayBuffer.h:
(JSC::arrayBufferSharingModeName):
(JSC::SharedArrayBufferContents::data):
(JSC::ArrayBufferContents::data):
(JSC::ArrayBufferContents::sizeInBytes):
(JSC::ArrayBufferContents::isShared):
(JSC::ArrayBuffer::sharingMode):
(JSC::ArrayBuffer::isShared):
(JSC::ArrayBuffer::gcSizeEstimateInBytes):
(JSC::arrayBufferDestructorNull): Deleted.
(JSC::arrayBufferDestructorDefault): Deleted.
(JSC::ArrayBufferContents::ArrayBufferContents): Deleted.
(JSC::ArrayBufferContents::transfer): Deleted.
(JSC::ArrayBufferContents::copyTo): Deleted.
(JSC::ArrayBuffer::create): Deleted.
(JSC::ArrayBuffer::createAdopted): Deleted.
(JSC::ArrayBuffer::createFromBytes): Deleted.
(JSC::ArrayBuffer::tryCreate): Deleted.
(JSC::ArrayBuffer::createUninitialized): Deleted.
(JSC::ArrayBuffer::tryCreateUninitialized): Deleted.
(JSC::ArrayBuffer::createInternal): Deleted.
(JSC::ArrayBuffer::ArrayBuffer): Deleted.
(JSC::ArrayBuffer::slice): Deleted.
(JSC::ArrayBuffer::sliceImpl): Deleted.
(JSC::ArrayBufferContents::tryAllocate): Deleted.
(JSC::ArrayBufferContents::~ArrayBufferContents): Deleted.
* runtime/ArrayBufferSharingMode.h: Added.
* runtime/ArrayBufferView.h:
(JSC::ArrayBufferView::possiblySharedBuffer):
(JSC::ArrayBufferView::unsharedBuffer):
(JSC::ArrayBufferView::isShared):
(JSC::ArrayBufferView::buffer): Deleted.
* runtime/AtomicsObject.cpp: Added.
(JSC::AtomicsObject::AtomicsObject):
(JSC::AtomicsObject::create):
(JSC::AtomicsObject::createStructure):
(JSC::AtomicsObject::finishCreation):
(JSC::atomicsFuncAdd):
(JSC::atomicsFuncAnd):
(JSC::atomicsFuncCompareExchange):
(JSC::atomicsFuncExchange):
(JSC::atomicsFuncIsLockFree):
(JSC::atomicsFuncLoad):
(JSC::atomicsFuncOr):
(JSC::atomicsFuncStore):
(JSC::atomicsFuncSub):
(JSC::atomicsFuncWait):
(JSC::atomicsFuncWake):
(JSC::atomicsFuncXor):
* runtime/AtomicsObject.h: Added.
* runtime/CommonIdentifiers.h:
* runtime/DataView.cpp:
(JSC::DataView::wrap):
* runtime/GenericTypedArrayViewInlines.h:
(JSC::GenericTypedArrayView<Adaptor>::subarray):
* runtime/Intrinsic.h:
* runtime/JSArrayBuffer.cpp:
(JSC::JSArrayBuffer::finishCreation):
(JSC::JSArrayBuffer::isShared):
(JSC::JSArrayBuffer::sharingMode):
* runtime/JSArrayBuffer.h:
(JSC::toPossiblySharedArrayBuffer):
(JSC::toUnsharedArrayBuffer):
(JSC::JSArrayBuffer::toWrapped):
(JSC::toArrayBuffer): Deleted.
* runtime/JSArrayBufferConstructor.cpp:
(JSC::JSArrayBufferConstructor::JSArrayBufferConstructor):
(JSC::JSArrayBufferConstructor::finishCreation):
(JSC::JSArrayBufferConstructor::create):
(JSC::constructArrayBuffer):
* runtime/JSArrayBufferConstructor.h:
(JSC::JSArrayBufferConstructor::sharingMode):
* runtime/JSArrayBufferPrototype.cpp:
(JSC::arrayBufferProtoFuncSlice):
(JSC::JSArrayBufferPrototype::JSArrayBufferPrototype):
(JSC::JSArrayBufferPrototype::finishCreation):
(JSC::JSArrayBufferPrototype::create):
* runtime/JSArrayBufferPrototype.h:
* runtime/JSArrayBufferView.cpp:
(JSC::JSArrayBufferView::finishCreation):
(JSC::JSArrayBufferView::visitChildren):
(JSC::JSArrayBufferView::unsharedBuffer):
(JSC::JSArrayBufferView::unsharedJSBuffer):
(JSC::JSArrayBufferView::possiblySharedJSBuffer):
(JSC::JSArrayBufferView::neuter):
(JSC::JSArrayBufferView::toWrapped): Deleted.
* runtime/JSArrayBufferView.h:
(JSC::JSArrayBufferView::jsBuffer): Deleted.
* runtime/JSArrayBufferViewInlines.h:
(JSC::JSArrayBufferView::isShared):
(JSC::JSArrayBufferView::possiblySharedBuffer):
(JSC::JSArrayBufferView::possiblySharedImpl):
(JSC::JSArrayBufferView::unsharedImpl):
(JSC::JSArrayBufferView::byteOffset):
(JSC::JSArrayBufferView::toWrapped):
(JSC::JSArrayBufferView::buffer): Deleted.
(JSC::JSArrayBufferView::impl): Deleted.
(JSC::JSArrayBufferView::neuter): Deleted.
* runtime/JSDataView.cpp:
(JSC::JSDataView::possiblySharedTypedImpl):
(JSC::JSDataView::unsharedTypedImpl):
(JSC::JSDataView::getTypedArrayImpl):
(JSC::JSDataView::typedImpl): Deleted.
* runtime/JSDataView.h:
(JSC::JSDataView::possiblySharedBuffer):
(JSC::JSDataView::unsharedBuffer):
(JSC::JSDataView::buffer): Deleted.
* runtime/JSDataViewPrototype.cpp:
(JSC::dataViewProtoGetterBuffer):
* runtime/JSGenericTypedArrayView.h:
(JSC::toPossiblySharedNativeTypedView):
(JSC::toUnsharedNativeTypedView):
(JSC::JSGenericTypedArrayView<Adaptor>::toWrapped):
(JSC::JSGenericTypedArrayView::typedImpl): Deleted.
(JSC::toNativeTypedView): Deleted.
* runtime/JSGenericTypedArrayViewInlines.h:
(JSC::JSGenericTypedArrayView<Adaptor>::create):
(JSC::JSGenericTypedArrayView<Adaptor>::possiblySharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::unsharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::getTypedArrayImpl):
* runtime/JSGenericTypedArrayViewPrototypeFunctions.h:
(JSC::genericTypedArrayViewProtoGetterFuncBuffer):
(JSC::genericTypedArrayViewPrivateFuncSubarrayCreate):
* runtime/JSGlobalObject.cpp:
(JSC::createAtomicsProperty):
(JSC::JSGlobalObject::init):
(JSC::JSGlobalObject::visitChildren):
* runtime/JSGlobalObject.h:
(JSC::JSGlobalObject::arrayBufferPrototype):
(JSC::JSGlobalObject::arrayBufferStructure):
* runtime/MathObject.cpp:
* runtime/RuntimeFlags.h:
* runtime/SimpleTypedArrayController.cpp:
(JSC::SimpleTypedArrayController::toJS):
* runtime/TypedArrayType.h:
(JSC::typedArrayTypeForType):
Source/WebCore:
New tests added in the LayoutTests/workers/sab directory.
This teaches WebCore that a typed array could be shared or not. By default, WebCore will
reject shared typed arrays as if they were not typed arrays. This ensures that we don't get
race conditions in code that can't handle it.
If you postMessage a SharedArrayBuffer or something that wraps it, you will send the shared
memory to the other worker.
* Modules/encryptedmedia/CDMSessionClearKey.cpp:
(WebCore::CDMSessionClearKey::cachedKeyForKeyID):
* Modules/fetch/FetchBody.cpp:
(WebCore::FetchBody::extract):
* Modules/mediastream/RTCDataChannel.cpp:
(WebCore::RTCDataChannel::send):
* Modules/webaudio/AudioBuffer.cpp:
(WebCore::AudioBuffer::getChannelData):
* Modules/websockets/WebSocket.cpp:
(WebCore::WebSocket::send):
* bindings/js/JSBlobCustom.cpp:
(WebCore::constructJSBlob):
* bindings/js/JSCryptoAlgorithmDictionary.cpp:
(WebCore::createRsaKeyGenParams):
* bindings/js/JSCryptoCustom.cpp:
(WebCore::JSCrypto::getRandomValues):
* bindings/js/JSCryptoOperationData.cpp:
(WebCore::cryptoOperationDataFromJSValue):
* bindings/js/JSDOMBinding.h:
(WebCore::toJS):
(WebCore::toPossiblySharedArrayBufferView):
(WebCore::toUnsharedArrayBufferView):
(WebCore::toPossiblySharedInt8Array):
(WebCore::toPossiblySharedInt16Array):
(WebCore::toPossiblySharedInt32Array):
(WebCore::toPossiblySharedUint8Array):
(WebCore::toPossiblySharedUint8ClampedArray):
(WebCore::toPossiblySharedUint16Array):
(WebCore::toPossiblySharedUint32Array):
(WebCore::toPossiblySharedFloat32Array):
(WebCore::toPossiblySharedFloat64Array):
(WebCore::toUnsharedInt8Array):
(WebCore::toUnsharedInt16Array):
(WebCore::toUnsharedInt32Array):
(WebCore::toUnsharedUint8Array):
(WebCore::toUnsharedUint8ClampedArray):
(WebCore::toUnsharedUint16Array):
(WebCore::toUnsharedUint32Array):
(WebCore::toUnsharedFloat32Array):
(WebCore::toUnsharedFloat64Array):
(WebCore::toArrayBufferView): Deleted.
(WebCore::toInt8Array): Deleted.
(WebCore::toInt16Array): Deleted.
(WebCore::toInt32Array): Deleted.
(WebCore::toUint8Array): Deleted.
(WebCore::toUint8ClampedArray): Deleted.
(WebCore::toUint16Array): Deleted.
(WebCore::toUint32Array): Deleted.
(WebCore::toFloat32Array): Deleted.
(WebCore::toFloat64Array): Deleted.
* bindings/js/JSDataCueCustom.cpp:
(WebCore::constructJSDataCue):
* bindings/js/JSDictionary.cpp:
(WebCore::JSDictionary::convertValue):
* bindings/js/JSFileCustom.cpp:
(WebCore::constructJSFile):
* bindings/js/JSMessagePortCustom.cpp:
(WebCore::extractTransferables):
* bindings/js/JSWebGLRenderingContextBaseCustom.cpp:
(WebCore::dataFunctionf):
(WebCore::dataFunctioni):
(WebCore::dataFunctionMatrix):
* bindings/js/JSXMLHttpRequestCustom.cpp:
(WebCore::JSXMLHttpRequest::send):
* bindings/js/SerializedScriptValue.cpp:
(WebCore::CloneSerializer::dumpArrayBufferView):
(WebCore::CloneSerializer::dumpIfTerminal):
(WebCore::CloneDeserializer::readArrayBufferView):
(WebCore::CloneDeserializer::readTerminal):
(WebCore::SerializedScriptValue::transferArrayBuffers):
* bindings/js/StructuredClone.cpp:
(WebCore::structuredCloneArrayBuffer):
(WebCore::structuredCloneArrayBufferView):
* bindings/scripts/CodeGeneratorJS.pm:
(JSValueToNative):
* css/FontFace.cpp:
(WebCore::FontFace::create):
* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::bufferData):
(WebCore::WebGL2RenderingContext::bufferSubData):
* platform/graphics/avfoundation/MediaPlayerPrivateAVFoundation.cpp:
(WebCore::MediaPlayerPrivateAVFoundation::extractKeyURIKeyIDAndCertificateFromInitData):
Source/WebKit/mac:
Support the RuntimeFlag.
* WebView/WebPreferencesPrivate.h:
Source/WebKit/win:
Support the RuntimeFlag.
* Interfaces/IWebPreferencesPrivate.idl:
Source/WebKit2:
Adds some small things we need for SharedArrayBuffer.
* UIProcess/API/C/WKPreferencesRefPrivate.h:
* UIProcess/API/Cocoa/WKPreferencesPrivate.h:
* WebProcess/InjectedBundle/InjectedBundle.cpp:
(WebKit::InjectedBundle::createWebDataFromUint8Array):
Source/WTF:
Adds some small things we need for SharedArrayBuffer.
* wtf/Atomics.h:
(WTF::Atomic::compareExchangeWeakRelaxed):
(WTF::Atomic::exchangeAdd):
(WTF::Atomic::exchangeAnd):
(WTF::Atomic::exchangeOr):
(WTF::Atomic::exchangeSub):
(WTF::Atomic::exchangeXor):
(WTF::atomicLoad):
(WTF::atomicStore):
(WTF::atomicCompareExchangeWeak):
(WTF::atomicCompareExchangeWeakRelaxed):
(WTF::atomicCompareExchangeStrong):
(WTF::atomicExchangeAdd):
(WTF::atomicExchangeAnd):
(WTF::atomicExchangeOr):
(WTF::atomicExchangeSub):
(WTF::atomicExchangeXor):
(WTF::atomicExchange):
(WTF::Atomic::exchangeAndAdd): Deleted.
(WTF::weakCompareAndSwap): Deleted.
We need to be able to do atomics operations on naked pointers. We also need to be able to do
all of the things that std::atomic does. This adds those things and renames
weakCompareAndSwap to atomicCompareExchangeWeakRelaxed so that we're using consistent
terminology.
* wtf/Bitmap.h:
(WTF::WordType>::concurrentTestAndSet): Renamed weakCompareAndSwap.
(WTF::WordType>::concurrentTestAndClear): Renamed weakCompareAndSwap.
* wtf/FastBitVector.h:
(WTF::FastBitVector::atomicSetAndCheck): Renamed weakCompareAndSwap.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
Added unparkCount(), which lets you unpark some bounded number of threads and returns the
number of threads unparked. This is just a modest extension of unparkAll(). unparkAll() now
just calls unparkCount(ptr, UINT_MAX).
Tools:
Use the right kind of typed array API.
* DumpRenderTree/TestRunner.cpp:
(setAudioResultCallback):
LayoutTests:
Adding tests. This is a work in progress.
* workers/sab: Added.
* workers/sab/simple-worker-1.js: Added.
(onmessage):
* workers/sab/simple-worker-2.js: Added.
(onmessage):
* workers/sab/simple.html: Added.
Canonical link: https://commits.webkit.org/181984@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208209 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-01 03:10:00 +00:00
// Why is this here?
// FIXME: It seems like this could be IgnoreEmpty, but I switched this to EnsureNonEmpty
// without explanation in r199760. We need it to use EnsureNonEmpty if we need to perform
// some operation while holding the bucket lock, which usually goes into the finish func.
// But if that operation is a no-op, then it's not clear why we need this.
2016-04-20 04:25:02 +00:00
BucketMode : : EnsureNonEmpty ,
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
[ & ] ( ThreadData * element , bool ) {
2015-08-11 19:51:35 +00:00
if ( element - > address ! = address )
return DequeueResult : : Ignore ;
threadData = element ;
2016-04-20 04:25:02 +00:00
result . didUnparkThread = true ;
2015-08-11 19:51:35 +00:00
return DequeueResult : : RemoveAndStop ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
} ,
[ ] ( bool ) { } ) ;
2015-08-11 19:51:35 +00:00
2016-04-20 04:25:02 +00:00
if ( ! threadData ) {
ASSERT ( ! result . didUnparkThread ) ;
result . mayHaveMoreThreads = false ;
return result ;
}
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
ASSERT ( threadData - > address ) ;
2015-08-11 19:51:35 +00:00
{
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
MutexLocker locker ( threadData - > parkingLock ) ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
threadData - > address = nullptr ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
threadData - > token = 0 ;
2015-08-11 19:51:35 +00:00
}
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
threadData - > parkingCondition . signal ( ) ;
2015-08-11 19:51:35 +00:00
return result ;
}
2016-05-26 21:58:42 +00:00
NEVER_INLINE void ParkingLot : : unparkOneImpl (
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
const void * address ,
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
const ScopedLambda < intptr_t ( ParkingLot : : UnparkResult ) > & callback )
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
{
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : unparking one the hard way. \n " ) ) ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
2016-09-13 15:59:25 +00:00
RefPtr < ThreadData > threadData ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
bool timeToBeFair = false ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
dequeue (
address ,
BucketMode : : EnsureNonEmpty ,
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
[ & ] ( ThreadData * element , bool passedTimeToBeFair ) {
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
if ( element - > address ! = address )
return DequeueResult : : Ignore ;
threadData = element ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
timeToBeFair = passedTimeToBeFair ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
return DequeueResult : : RemoveAndStop ;
} ,
[ & ] ( bool mayHaveMoreThreads ) {
2016-04-20 04:25:02 +00:00
UnparkResult result ;
result . didUnparkThread = ! ! threadData ;
result . mayHaveMoreThreads = result . didUnparkThread & & mayHaveMoreThreads ;
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
if ( timeToBeFair )
RELEASE_ASSERT ( threadData ) ;
result . timeToBeFair = timeToBeFair ;
intptr_t token = callback ( result ) ;
if ( threadData )
threadData - > token = token ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
} ) ;
if ( ! threadData )
return ;
ASSERT ( threadData - > address ) ;
{
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
MutexLocker locker ( threadData - > parkingLock ) ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
threadData - > address = nullptr ;
}
REGRESSION (r207480): 3 Dromaeo tests failing
https://bugs.webkit.org/show_bug.cgi?id=163633
Reviewed by Mark Lam.
It's a ParkingLot bug: if we timeout and get unparked at the same time, then the unparking
thread will clear our address eventually - but not immediately. This causes a nasty
assertion failure. The tricky thing is that when we detect this, we need to wait until that
unparking thread does the deed. Otherwise, they will still do it at some later time.
Alternatively, we could use some kind of versioning to detect this - increment the version
when you park, so that unparking threads will know if they are time travelers. That seems
more yucky.
I don't think it matters too much what we do here, so long as it's simple and correct, since
this requires a race that is rare enough that it didn't happen once in my testing, and I was
pretty thorough. For example, it didn't happen once in 15 runs of JetStream. The race is
rare because it requires the timeout to happen right as someone else is unparking. Since
it's so rare, its probably OK that the unparked thread waits just a tiny moment until the
unparking thread is done.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOneImpl):
Canonical link: https://commits.webkit.org/181485@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@207577 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-10-19 23:37:20 +00:00
// At this point, the threadData may die. Good thing we have a RefPtr<> on it.
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
threadData - > parkingCondition . signal ( ) ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
}
JSC should support SharedArrayBuffer
https://bugs.webkit.org/show_bug.cgi?id=163986
Reviewed by Keith Miller.
JSTests:
This adds our own test for the various corner cases of SharedArrayBuffer. This test is meant to
check all of the things that don't require concurrency.
* stress/SharedArrayBuffer.js: Added.
(checkAtomics):
(shouldFail):
(Symbol):
(runAtomic):
Source/JavaScriptCore:
This implements https://tc39.github.io/ecmascript_sharedmem/shmem.html.
There is now a new SharedArrayBuffer type. In the JS runtime, which includes typed array
types, the SharedArrayBuffer is a drop-in replacement for ArrayBuffer, even though they are
distinct types (new SharedArrayBuffer() instanceof ArrayBuffer == false and vice versa). The
DOM will not recognize SharedArrayBuffer, or any typed array that wraps it, to ensure safety.
This matches what other browsers intend to do, see
https://github.com/tc39/ecmascript_sharedmem/issues/38. API is provided for the DOM to opt
into SharedArrayBuffer. One notable place is postMessage, which will share the
SharedArrayBuffer's underlying data storage with other workers. This creates a pool of shared
memory that the workers can use to talk to each other.
There is also an Atomics object in global scope, which exposes sequentially consistent atomic
operations: add, and, compareExchange, exchange, load, or, store, sub, and xor. Additionally
it exposes a Atomics.isLockFree utility, which takes a byte amount and returns true or false.
Also there is Atomics.wake/wait, which neatly map to ParkingLot.
Accesses to typed arrays that wrap SharedArrayBuffer are optimized by JSC the same way as
always. I believe that DFG and B3 already obey the following memory model, which I believe is
a bit weaker than Cambridge and a bit stronger than what is being proposed for
SharedArrayBuffer. To predict a program's behavior under the B3 memory model, imagine the
space of all possible programs that would result from running an optimizer that adversarially
follows B3's transformation rules. B3 transformations are correct if the newly created
program is equivalent to the old one, assuming that any opaque effect in IR (like the reads
and writes of a patchpoint/call/fence) could perform any load/store that satisfies the
B3::Effects summary. Opaque effects are a way of describing an infinite set of programs: any
program that only does the effects summarized in B3::Effects belongs to the set. For example,
this prevents motion of operations across fences since fences are summarized as opaque
effects that could read or write memory. This rule alone is not enough, because it leaves the
door open for turning an atomic operation (like a load) into a non-atomic one (like a load
followed by a store of the same value back to the same location or multiple loads). This is
not an optimization that either our compiler or the CPU would want to do. One way to think of
what exactly is forbidden is that B3 transformations that mess with memory accesses can only
reorder them or remove them. This means that for any execution of the untransformed program,
the corresponding execution of the transformed program (i.e. with the same input arguments
and the same programs filled in for the opaque effects) must have the same loads and stores,
with some removed and some reordered. This is a fairly simple mental model that B3 and DFG
already follow and it's based on existing abstractions for the infinite set of programs
inside an opaque effect (DFG's AbstractHeaps and B3's Effects).
This patch makes all atomics operations intrinsic, but the DFG doesn't know about any of them
yet. That's covered by bug 164108.
This ought to be perf-neutral, but I am still running tests to confirm this. I'm also still
writing new tests to cover all of the Atomics functionality and the behavior of SAB objects.
* API/JSTypedArray.cpp:
(JSObjectGetTypedArrayBytesPtr):
(JSObjectGetTypedArrayBuffer):
(JSObjectMakeArrayBufferWithBytesNoCopy):
* API/tests/CompareAndSwapTest.cpp:
(Bitmap::concurrentTestAndSet):
* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* dfg/DFGDesiredWatchpoints.cpp:
(JSC::DFG::ArrayBufferViewWatchpointAdaptor::add):
* heap/Heap.cpp:
(JSC::Heap::reportExtraMemoryVisited):
(JSC::Heap::reportExternalMemoryVisited):
* jsc.cpp:
(functionTransferArrayBuffer):
* runtime/ArrayBuffer.cpp:
(JSC::SharedArrayBufferContents::SharedArrayBufferContents):
(JSC::SharedArrayBufferContents::~SharedArrayBufferContents):
(JSC::ArrayBufferContents::ArrayBufferContents):
(JSC::ArrayBufferContents::operator=):
(JSC::ArrayBufferContents::~ArrayBufferContents):
(JSC::ArrayBufferContents::clear):
(JSC::ArrayBufferContents::destroy):
(JSC::ArrayBufferContents::reset):
(JSC::ArrayBufferContents::tryAllocate):
(JSC::ArrayBufferContents::makeShared):
(JSC::ArrayBufferContents::transferTo):
(JSC::ArrayBufferContents::copyTo):
(JSC::ArrayBufferContents::shareWith):
(JSC::ArrayBuffer::create):
(JSC::ArrayBuffer::createAdopted):
(JSC::ArrayBuffer::createFromBytes):
(JSC::ArrayBuffer::tryCreate):
(JSC::ArrayBuffer::createUninitialized):
(JSC::ArrayBuffer::tryCreateUninitialized):
(JSC::ArrayBuffer::createInternal):
(JSC::ArrayBuffer::ArrayBuffer):
(JSC::ArrayBuffer::slice):
(JSC::ArrayBuffer::sliceImpl):
(JSC::ArrayBuffer::makeShared):
(JSC::ArrayBuffer::setSharingMode):
(JSC::ArrayBuffer::transferTo):
(JSC::ArrayBuffer::transfer): Deleted.
* runtime/ArrayBuffer.h:
(JSC::arrayBufferSharingModeName):
(JSC::SharedArrayBufferContents::data):
(JSC::ArrayBufferContents::data):
(JSC::ArrayBufferContents::sizeInBytes):
(JSC::ArrayBufferContents::isShared):
(JSC::ArrayBuffer::sharingMode):
(JSC::ArrayBuffer::isShared):
(JSC::ArrayBuffer::gcSizeEstimateInBytes):
(JSC::arrayBufferDestructorNull): Deleted.
(JSC::arrayBufferDestructorDefault): Deleted.
(JSC::ArrayBufferContents::ArrayBufferContents): Deleted.
(JSC::ArrayBufferContents::transfer): Deleted.
(JSC::ArrayBufferContents::copyTo): Deleted.
(JSC::ArrayBuffer::create): Deleted.
(JSC::ArrayBuffer::createAdopted): Deleted.
(JSC::ArrayBuffer::createFromBytes): Deleted.
(JSC::ArrayBuffer::tryCreate): Deleted.
(JSC::ArrayBuffer::createUninitialized): Deleted.
(JSC::ArrayBuffer::tryCreateUninitialized): Deleted.
(JSC::ArrayBuffer::createInternal): Deleted.
(JSC::ArrayBuffer::ArrayBuffer): Deleted.
(JSC::ArrayBuffer::slice): Deleted.
(JSC::ArrayBuffer::sliceImpl): Deleted.
(JSC::ArrayBufferContents::tryAllocate): Deleted.
(JSC::ArrayBufferContents::~ArrayBufferContents): Deleted.
* runtime/ArrayBufferSharingMode.h: Added.
* runtime/ArrayBufferView.h:
(JSC::ArrayBufferView::possiblySharedBuffer):
(JSC::ArrayBufferView::unsharedBuffer):
(JSC::ArrayBufferView::isShared):
(JSC::ArrayBufferView::buffer): Deleted.
* runtime/AtomicsObject.cpp: Added.
(JSC::AtomicsObject::AtomicsObject):
(JSC::AtomicsObject::create):
(JSC::AtomicsObject::createStructure):
(JSC::AtomicsObject::finishCreation):
(JSC::atomicsFuncAdd):
(JSC::atomicsFuncAnd):
(JSC::atomicsFuncCompareExchange):
(JSC::atomicsFuncExchange):
(JSC::atomicsFuncIsLockFree):
(JSC::atomicsFuncLoad):
(JSC::atomicsFuncOr):
(JSC::atomicsFuncStore):
(JSC::atomicsFuncSub):
(JSC::atomicsFuncWait):
(JSC::atomicsFuncWake):
(JSC::atomicsFuncXor):
* runtime/AtomicsObject.h: Added.
* runtime/CommonIdentifiers.h:
* runtime/DataView.cpp:
(JSC::DataView::wrap):
* runtime/GenericTypedArrayViewInlines.h:
(JSC::GenericTypedArrayView<Adaptor>::subarray):
* runtime/Intrinsic.h:
* runtime/JSArrayBuffer.cpp:
(JSC::JSArrayBuffer::finishCreation):
(JSC::JSArrayBuffer::isShared):
(JSC::JSArrayBuffer::sharingMode):
* runtime/JSArrayBuffer.h:
(JSC::toPossiblySharedArrayBuffer):
(JSC::toUnsharedArrayBuffer):
(JSC::JSArrayBuffer::toWrapped):
(JSC::toArrayBuffer): Deleted.
* runtime/JSArrayBufferConstructor.cpp:
(JSC::JSArrayBufferConstructor::JSArrayBufferConstructor):
(JSC::JSArrayBufferConstructor::finishCreation):
(JSC::JSArrayBufferConstructor::create):
(JSC::constructArrayBuffer):
* runtime/JSArrayBufferConstructor.h:
(JSC::JSArrayBufferConstructor::sharingMode):
* runtime/JSArrayBufferPrototype.cpp:
(JSC::arrayBufferProtoFuncSlice):
(JSC::JSArrayBufferPrototype::JSArrayBufferPrototype):
(JSC::JSArrayBufferPrototype::finishCreation):
(JSC::JSArrayBufferPrototype::create):
* runtime/JSArrayBufferPrototype.h:
* runtime/JSArrayBufferView.cpp:
(JSC::JSArrayBufferView::finishCreation):
(JSC::JSArrayBufferView::visitChildren):
(JSC::JSArrayBufferView::unsharedBuffer):
(JSC::JSArrayBufferView::unsharedJSBuffer):
(JSC::JSArrayBufferView::possiblySharedJSBuffer):
(JSC::JSArrayBufferView::neuter):
(JSC::JSArrayBufferView::toWrapped): Deleted.
* runtime/JSArrayBufferView.h:
(JSC::JSArrayBufferView::jsBuffer): Deleted.
* runtime/JSArrayBufferViewInlines.h:
(JSC::JSArrayBufferView::isShared):
(JSC::JSArrayBufferView::possiblySharedBuffer):
(JSC::JSArrayBufferView::possiblySharedImpl):
(JSC::JSArrayBufferView::unsharedImpl):
(JSC::JSArrayBufferView::byteOffset):
(JSC::JSArrayBufferView::toWrapped):
(JSC::JSArrayBufferView::buffer): Deleted.
(JSC::JSArrayBufferView::impl): Deleted.
(JSC::JSArrayBufferView::neuter): Deleted.
* runtime/JSDataView.cpp:
(JSC::JSDataView::possiblySharedTypedImpl):
(JSC::JSDataView::unsharedTypedImpl):
(JSC::JSDataView::getTypedArrayImpl):
(JSC::JSDataView::typedImpl): Deleted.
* runtime/JSDataView.h:
(JSC::JSDataView::possiblySharedBuffer):
(JSC::JSDataView::unsharedBuffer):
(JSC::JSDataView::buffer): Deleted.
* runtime/JSDataViewPrototype.cpp:
(JSC::dataViewProtoGetterBuffer):
* runtime/JSGenericTypedArrayView.h:
(JSC::toPossiblySharedNativeTypedView):
(JSC::toUnsharedNativeTypedView):
(JSC::JSGenericTypedArrayView<Adaptor>::toWrapped):
(JSC::JSGenericTypedArrayView::typedImpl): Deleted.
(JSC::toNativeTypedView): Deleted.
* runtime/JSGenericTypedArrayViewInlines.h:
(JSC::JSGenericTypedArrayView<Adaptor>::create):
(JSC::JSGenericTypedArrayView<Adaptor>::possiblySharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::unsharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::getTypedArrayImpl):
* runtime/JSGenericTypedArrayViewPrototypeFunctions.h:
(JSC::genericTypedArrayViewProtoGetterFuncBuffer):
(JSC::genericTypedArrayViewPrivateFuncSubarrayCreate):
* runtime/JSGlobalObject.cpp:
(JSC::createAtomicsProperty):
(JSC::JSGlobalObject::init):
(JSC::JSGlobalObject::visitChildren):
* runtime/JSGlobalObject.h:
(JSC::JSGlobalObject::arrayBufferPrototype):
(JSC::JSGlobalObject::arrayBufferStructure):
* runtime/MathObject.cpp:
* runtime/RuntimeFlags.h:
* runtime/SimpleTypedArrayController.cpp:
(JSC::SimpleTypedArrayController::toJS):
* runtime/TypedArrayType.h:
(JSC::typedArrayTypeForType):
Source/WebCore:
New tests added in the LayoutTests/workers/sab directory.
This teaches WebCore that a typed array could be shared or not. By default, WebCore will
reject shared typed arrays as if they were not typed arrays. This ensures that we don't get
race conditions in code that can't handle it.
If you postMessage a SharedArrayBuffer or something that wraps it, you will send the shared
memory to the other worker.
* Modules/encryptedmedia/CDMSessionClearKey.cpp:
(WebCore::CDMSessionClearKey::cachedKeyForKeyID):
* Modules/fetch/FetchBody.cpp:
(WebCore::FetchBody::extract):
* Modules/mediastream/RTCDataChannel.cpp:
(WebCore::RTCDataChannel::send):
* Modules/webaudio/AudioBuffer.cpp:
(WebCore::AudioBuffer::getChannelData):
* Modules/websockets/WebSocket.cpp:
(WebCore::WebSocket::send):
* bindings/js/JSBlobCustom.cpp:
(WebCore::constructJSBlob):
* bindings/js/JSCryptoAlgorithmDictionary.cpp:
(WebCore::createRsaKeyGenParams):
* bindings/js/JSCryptoCustom.cpp:
(WebCore::JSCrypto::getRandomValues):
* bindings/js/JSCryptoOperationData.cpp:
(WebCore::cryptoOperationDataFromJSValue):
* bindings/js/JSDOMBinding.h:
(WebCore::toJS):
(WebCore::toPossiblySharedArrayBufferView):
(WebCore::toUnsharedArrayBufferView):
(WebCore::toPossiblySharedInt8Array):
(WebCore::toPossiblySharedInt16Array):
(WebCore::toPossiblySharedInt32Array):
(WebCore::toPossiblySharedUint8Array):
(WebCore::toPossiblySharedUint8ClampedArray):
(WebCore::toPossiblySharedUint16Array):
(WebCore::toPossiblySharedUint32Array):
(WebCore::toPossiblySharedFloat32Array):
(WebCore::toPossiblySharedFloat64Array):
(WebCore::toUnsharedInt8Array):
(WebCore::toUnsharedInt16Array):
(WebCore::toUnsharedInt32Array):
(WebCore::toUnsharedUint8Array):
(WebCore::toUnsharedUint8ClampedArray):
(WebCore::toUnsharedUint16Array):
(WebCore::toUnsharedUint32Array):
(WebCore::toUnsharedFloat32Array):
(WebCore::toUnsharedFloat64Array):
(WebCore::toArrayBufferView): Deleted.
(WebCore::toInt8Array): Deleted.
(WebCore::toInt16Array): Deleted.
(WebCore::toInt32Array): Deleted.
(WebCore::toUint8Array): Deleted.
(WebCore::toUint8ClampedArray): Deleted.
(WebCore::toUint16Array): Deleted.
(WebCore::toUint32Array): Deleted.
(WebCore::toFloat32Array): Deleted.
(WebCore::toFloat64Array): Deleted.
* bindings/js/JSDataCueCustom.cpp:
(WebCore::constructJSDataCue):
* bindings/js/JSDictionary.cpp:
(WebCore::JSDictionary::convertValue):
* bindings/js/JSFileCustom.cpp:
(WebCore::constructJSFile):
* bindings/js/JSMessagePortCustom.cpp:
(WebCore::extractTransferables):
* bindings/js/JSWebGLRenderingContextBaseCustom.cpp:
(WebCore::dataFunctionf):
(WebCore::dataFunctioni):
(WebCore::dataFunctionMatrix):
* bindings/js/JSXMLHttpRequestCustom.cpp:
(WebCore::JSXMLHttpRequest::send):
* bindings/js/SerializedScriptValue.cpp:
(WebCore::CloneSerializer::dumpArrayBufferView):
(WebCore::CloneSerializer::dumpIfTerminal):
(WebCore::CloneDeserializer::readArrayBufferView):
(WebCore::CloneDeserializer::readTerminal):
(WebCore::SerializedScriptValue::transferArrayBuffers):
* bindings/js/StructuredClone.cpp:
(WebCore::structuredCloneArrayBuffer):
(WebCore::structuredCloneArrayBufferView):
* bindings/scripts/CodeGeneratorJS.pm:
(JSValueToNative):
* css/FontFace.cpp:
(WebCore::FontFace::create):
* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::bufferData):
(WebCore::WebGL2RenderingContext::bufferSubData):
* platform/graphics/avfoundation/MediaPlayerPrivateAVFoundation.cpp:
(WebCore::MediaPlayerPrivateAVFoundation::extractKeyURIKeyIDAndCertificateFromInitData):
Source/WebKit/mac:
Support the RuntimeFlag.
* WebView/WebPreferencesPrivate.h:
Source/WebKit/win:
Support the RuntimeFlag.
* Interfaces/IWebPreferencesPrivate.idl:
Source/WebKit2:
Adds some small things we need for SharedArrayBuffer.
* UIProcess/API/C/WKPreferencesRefPrivate.h:
* UIProcess/API/Cocoa/WKPreferencesPrivate.h:
* WebProcess/InjectedBundle/InjectedBundle.cpp:
(WebKit::InjectedBundle::createWebDataFromUint8Array):
Source/WTF:
Adds some small things we need for SharedArrayBuffer.
* wtf/Atomics.h:
(WTF::Atomic::compareExchangeWeakRelaxed):
(WTF::Atomic::exchangeAdd):
(WTF::Atomic::exchangeAnd):
(WTF::Atomic::exchangeOr):
(WTF::Atomic::exchangeSub):
(WTF::Atomic::exchangeXor):
(WTF::atomicLoad):
(WTF::atomicStore):
(WTF::atomicCompareExchangeWeak):
(WTF::atomicCompareExchangeWeakRelaxed):
(WTF::atomicCompareExchangeStrong):
(WTF::atomicExchangeAdd):
(WTF::atomicExchangeAnd):
(WTF::atomicExchangeOr):
(WTF::atomicExchangeSub):
(WTF::atomicExchangeXor):
(WTF::atomicExchange):
(WTF::Atomic::exchangeAndAdd): Deleted.
(WTF::weakCompareAndSwap): Deleted.
We need to be able to do atomics operations on naked pointers. We also need to be able to do
all of the things that std::atomic does. This adds those things and renames
weakCompareAndSwap to atomicCompareExchangeWeakRelaxed so that we're using consistent
terminology.
* wtf/Bitmap.h:
(WTF::WordType>::concurrentTestAndSet): Renamed weakCompareAndSwap.
(WTF::WordType>::concurrentTestAndClear): Renamed weakCompareAndSwap.
* wtf/FastBitVector.h:
(WTF::FastBitVector::atomicSetAndCheck): Renamed weakCompareAndSwap.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
Added unparkCount(), which lets you unpark some bounded number of threads and returns the
number of threads unparked. This is just a modest extension of unparkAll(). unparkAll() now
just calls unparkCount(ptr, UINT_MAX).
Tools:
Use the right kind of typed array API.
* DumpRenderTree/TestRunner.cpp:
(setAudioResultCallback):
LayoutTests:
Adding tests. This is a work in progress.
* workers/sab: Added.
* workers/sab/simple-worker-1.js: Added.
(onmessage):
* workers/sab/simple-worker-2.js: Added.
(onmessage):
* workers/sab/simple.html: Added.
Canonical link: https://commits.webkit.org/181984@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208209 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-01 03:10:00 +00:00
NEVER_INLINE unsigned ParkingLot : : unparkCount ( const void * address , unsigned count )
2015-08-11 19:51:35 +00:00
{
JSC should support SharedArrayBuffer
https://bugs.webkit.org/show_bug.cgi?id=163986
Reviewed by Keith Miller.
JSTests:
This adds our own test for the various corner cases of SharedArrayBuffer. This test is meant to
check all of the things that don't require concurrency.
* stress/SharedArrayBuffer.js: Added.
(checkAtomics):
(shouldFail):
(Symbol):
(runAtomic):
Source/JavaScriptCore:
This implements https://tc39.github.io/ecmascript_sharedmem/shmem.html.
There is now a new SharedArrayBuffer type. In the JS runtime, which includes typed array
types, the SharedArrayBuffer is a drop-in replacement for ArrayBuffer, even though they are
distinct types (new SharedArrayBuffer() instanceof ArrayBuffer == false and vice versa). The
DOM will not recognize SharedArrayBuffer, or any typed array that wraps it, to ensure safety.
This matches what other browsers intend to do, see
https://github.com/tc39/ecmascript_sharedmem/issues/38. API is provided for the DOM to opt
into SharedArrayBuffer. One notable place is postMessage, which will share the
SharedArrayBuffer's underlying data storage with other workers. This creates a pool of shared
memory that the workers can use to talk to each other.
There is also an Atomics object in global scope, which exposes sequentially consistent atomic
operations: add, and, compareExchange, exchange, load, or, store, sub, and xor. Additionally
it exposes a Atomics.isLockFree utility, which takes a byte amount and returns true or false.
Also there is Atomics.wake/wait, which neatly map to ParkingLot.
Accesses to typed arrays that wrap SharedArrayBuffer are optimized by JSC the same way as
always. I believe that DFG and B3 already obey the following memory model, which I believe is
a bit weaker than Cambridge and a bit stronger than what is being proposed for
SharedArrayBuffer. To predict a program's behavior under the B3 memory model, imagine the
space of all possible programs that would result from running an optimizer that adversarially
follows B3's transformation rules. B3 transformations are correct if the newly created
program is equivalent to the old one, assuming that any opaque effect in IR (like the reads
and writes of a patchpoint/call/fence) could perform any load/store that satisfies the
B3::Effects summary. Opaque effects are a way of describing an infinite set of programs: any
program that only does the effects summarized in B3::Effects belongs to the set. For example,
this prevents motion of operations across fences since fences are summarized as opaque
effects that could read or write memory. This rule alone is not enough, because it leaves the
door open for turning an atomic operation (like a load) into a non-atomic one (like a load
followed by a store of the same value back to the same location or multiple loads). This is
not an optimization that either our compiler or the CPU would want to do. One way to think of
what exactly is forbidden is that B3 transformations that mess with memory accesses can only
reorder them or remove them. This means that for any execution of the untransformed program,
the corresponding execution of the transformed program (i.e. with the same input arguments
and the same programs filled in for the opaque effects) must have the same loads and stores,
with some removed and some reordered. This is a fairly simple mental model that B3 and DFG
already follow and it's based on existing abstractions for the infinite set of programs
inside an opaque effect (DFG's AbstractHeaps and B3's Effects).
This patch makes all atomics operations intrinsic, but the DFG doesn't know about any of them
yet. That's covered by bug 164108.
This ought to be perf-neutral, but I am still running tests to confirm this. I'm also still
writing new tests to cover all of the Atomics functionality and the behavior of SAB objects.
* API/JSTypedArray.cpp:
(JSObjectGetTypedArrayBytesPtr):
(JSObjectGetTypedArrayBuffer):
(JSObjectMakeArrayBufferWithBytesNoCopy):
* API/tests/CompareAndSwapTest.cpp:
(Bitmap::concurrentTestAndSet):
* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* dfg/DFGDesiredWatchpoints.cpp:
(JSC::DFG::ArrayBufferViewWatchpointAdaptor::add):
* heap/Heap.cpp:
(JSC::Heap::reportExtraMemoryVisited):
(JSC::Heap::reportExternalMemoryVisited):
* jsc.cpp:
(functionTransferArrayBuffer):
* runtime/ArrayBuffer.cpp:
(JSC::SharedArrayBufferContents::SharedArrayBufferContents):
(JSC::SharedArrayBufferContents::~SharedArrayBufferContents):
(JSC::ArrayBufferContents::ArrayBufferContents):
(JSC::ArrayBufferContents::operator=):
(JSC::ArrayBufferContents::~ArrayBufferContents):
(JSC::ArrayBufferContents::clear):
(JSC::ArrayBufferContents::destroy):
(JSC::ArrayBufferContents::reset):
(JSC::ArrayBufferContents::tryAllocate):
(JSC::ArrayBufferContents::makeShared):
(JSC::ArrayBufferContents::transferTo):
(JSC::ArrayBufferContents::copyTo):
(JSC::ArrayBufferContents::shareWith):
(JSC::ArrayBuffer::create):
(JSC::ArrayBuffer::createAdopted):
(JSC::ArrayBuffer::createFromBytes):
(JSC::ArrayBuffer::tryCreate):
(JSC::ArrayBuffer::createUninitialized):
(JSC::ArrayBuffer::tryCreateUninitialized):
(JSC::ArrayBuffer::createInternal):
(JSC::ArrayBuffer::ArrayBuffer):
(JSC::ArrayBuffer::slice):
(JSC::ArrayBuffer::sliceImpl):
(JSC::ArrayBuffer::makeShared):
(JSC::ArrayBuffer::setSharingMode):
(JSC::ArrayBuffer::transferTo):
(JSC::ArrayBuffer::transfer): Deleted.
* runtime/ArrayBuffer.h:
(JSC::arrayBufferSharingModeName):
(JSC::SharedArrayBufferContents::data):
(JSC::ArrayBufferContents::data):
(JSC::ArrayBufferContents::sizeInBytes):
(JSC::ArrayBufferContents::isShared):
(JSC::ArrayBuffer::sharingMode):
(JSC::ArrayBuffer::isShared):
(JSC::ArrayBuffer::gcSizeEstimateInBytes):
(JSC::arrayBufferDestructorNull): Deleted.
(JSC::arrayBufferDestructorDefault): Deleted.
(JSC::ArrayBufferContents::ArrayBufferContents): Deleted.
(JSC::ArrayBufferContents::transfer): Deleted.
(JSC::ArrayBufferContents::copyTo): Deleted.
(JSC::ArrayBuffer::create): Deleted.
(JSC::ArrayBuffer::createAdopted): Deleted.
(JSC::ArrayBuffer::createFromBytes): Deleted.
(JSC::ArrayBuffer::tryCreate): Deleted.
(JSC::ArrayBuffer::createUninitialized): Deleted.
(JSC::ArrayBuffer::tryCreateUninitialized): Deleted.
(JSC::ArrayBuffer::createInternal): Deleted.
(JSC::ArrayBuffer::ArrayBuffer): Deleted.
(JSC::ArrayBuffer::slice): Deleted.
(JSC::ArrayBuffer::sliceImpl): Deleted.
(JSC::ArrayBufferContents::tryAllocate): Deleted.
(JSC::ArrayBufferContents::~ArrayBufferContents): Deleted.
* runtime/ArrayBufferSharingMode.h: Added.
* runtime/ArrayBufferView.h:
(JSC::ArrayBufferView::possiblySharedBuffer):
(JSC::ArrayBufferView::unsharedBuffer):
(JSC::ArrayBufferView::isShared):
(JSC::ArrayBufferView::buffer): Deleted.
* runtime/AtomicsObject.cpp: Added.
(JSC::AtomicsObject::AtomicsObject):
(JSC::AtomicsObject::create):
(JSC::AtomicsObject::createStructure):
(JSC::AtomicsObject::finishCreation):
(JSC::atomicsFuncAdd):
(JSC::atomicsFuncAnd):
(JSC::atomicsFuncCompareExchange):
(JSC::atomicsFuncExchange):
(JSC::atomicsFuncIsLockFree):
(JSC::atomicsFuncLoad):
(JSC::atomicsFuncOr):
(JSC::atomicsFuncStore):
(JSC::atomicsFuncSub):
(JSC::atomicsFuncWait):
(JSC::atomicsFuncWake):
(JSC::atomicsFuncXor):
* runtime/AtomicsObject.h: Added.
* runtime/CommonIdentifiers.h:
* runtime/DataView.cpp:
(JSC::DataView::wrap):
* runtime/GenericTypedArrayViewInlines.h:
(JSC::GenericTypedArrayView<Adaptor>::subarray):
* runtime/Intrinsic.h:
* runtime/JSArrayBuffer.cpp:
(JSC::JSArrayBuffer::finishCreation):
(JSC::JSArrayBuffer::isShared):
(JSC::JSArrayBuffer::sharingMode):
* runtime/JSArrayBuffer.h:
(JSC::toPossiblySharedArrayBuffer):
(JSC::toUnsharedArrayBuffer):
(JSC::JSArrayBuffer::toWrapped):
(JSC::toArrayBuffer): Deleted.
* runtime/JSArrayBufferConstructor.cpp:
(JSC::JSArrayBufferConstructor::JSArrayBufferConstructor):
(JSC::JSArrayBufferConstructor::finishCreation):
(JSC::JSArrayBufferConstructor::create):
(JSC::constructArrayBuffer):
* runtime/JSArrayBufferConstructor.h:
(JSC::JSArrayBufferConstructor::sharingMode):
* runtime/JSArrayBufferPrototype.cpp:
(JSC::arrayBufferProtoFuncSlice):
(JSC::JSArrayBufferPrototype::JSArrayBufferPrototype):
(JSC::JSArrayBufferPrototype::finishCreation):
(JSC::JSArrayBufferPrototype::create):
* runtime/JSArrayBufferPrototype.h:
* runtime/JSArrayBufferView.cpp:
(JSC::JSArrayBufferView::finishCreation):
(JSC::JSArrayBufferView::visitChildren):
(JSC::JSArrayBufferView::unsharedBuffer):
(JSC::JSArrayBufferView::unsharedJSBuffer):
(JSC::JSArrayBufferView::possiblySharedJSBuffer):
(JSC::JSArrayBufferView::neuter):
(JSC::JSArrayBufferView::toWrapped): Deleted.
* runtime/JSArrayBufferView.h:
(JSC::JSArrayBufferView::jsBuffer): Deleted.
* runtime/JSArrayBufferViewInlines.h:
(JSC::JSArrayBufferView::isShared):
(JSC::JSArrayBufferView::possiblySharedBuffer):
(JSC::JSArrayBufferView::possiblySharedImpl):
(JSC::JSArrayBufferView::unsharedImpl):
(JSC::JSArrayBufferView::byteOffset):
(JSC::JSArrayBufferView::toWrapped):
(JSC::JSArrayBufferView::buffer): Deleted.
(JSC::JSArrayBufferView::impl): Deleted.
(JSC::JSArrayBufferView::neuter): Deleted.
* runtime/JSDataView.cpp:
(JSC::JSDataView::possiblySharedTypedImpl):
(JSC::JSDataView::unsharedTypedImpl):
(JSC::JSDataView::getTypedArrayImpl):
(JSC::JSDataView::typedImpl): Deleted.
* runtime/JSDataView.h:
(JSC::JSDataView::possiblySharedBuffer):
(JSC::JSDataView::unsharedBuffer):
(JSC::JSDataView::buffer): Deleted.
* runtime/JSDataViewPrototype.cpp:
(JSC::dataViewProtoGetterBuffer):
* runtime/JSGenericTypedArrayView.h:
(JSC::toPossiblySharedNativeTypedView):
(JSC::toUnsharedNativeTypedView):
(JSC::JSGenericTypedArrayView<Adaptor>::toWrapped):
(JSC::JSGenericTypedArrayView::typedImpl): Deleted.
(JSC::toNativeTypedView): Deleted.
* runtime/JSGenericTypedArrayViewInlines.h:
(JSC::JSGenericTypedArrayView<Adaptor>::create):
(JSC::JSGenericTypedArrayView<Adaptor>::possiblySharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::unsharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::getTypedArrayImpl):
* runtime/JSGenericTypedArrayViewPrototypeFunctions.h:
(JSC::genericTypedArrayViewProtoGetterFuncBuffer):
(JSC::genericTypedArrayViewPrivateFuncSubarrayCreate):
* runtime/JSGlobalObject.cpp:
(JSC::createAtomicsProperty):
(JSC::JSGlobalObject::init):
(JSC::JSGlobalObject::visitChildren):
* runtime/JSGlobalObject.h:
(JSC::JSGlobalObject::arrayBufferPrototype):
(JSC::JSGlobalObject::arrayBufferStructure):
* runtime/MathObject.cpp:
* runtime/RuntimeFlags.h:
* runtime/SimpleTypedArrayController.cpp:
(JSC::SimpleTypedArrayController::toJS):
* runtime/TypedArrayType.h:
(JSC::typedArrayTypeForType):
Source/WebCore:
New tests added in the LayoutTests/workers/sab directory.
This teaches WebCore that a typed array could be shared or not. By default, WebCore will
reject shared typed arrays as if they were not typed arrays. This ensures that we don't get
race conditions in code that can't handle it.
If you postMessage a SharedArrayBuffer or something that wraps it, you will send the shared
memory to the other worker.
* Modules/encryptedmedia/CDMSessionClearKey.cpp:
(WebCore::CDMSessionClearKey::cachedKeyForKeyID):
* Modules/fetch/FetchBody.cpp:
(WebCore::FetchBody::extract):
* Modules/mediastream/RTCDataChannel.cpp:
(WebCore::RTCDataChannel::send):
* Modules/webaudio/AudioBuffer.cpp:
(WebCore::AudioBuffer::getChannelData):
* Modules/websockets/WebSocket.cpp:
(WebCore::WebSocket::send):
* bindings/js/JSBlobCustom.cpp:
(WebCore::constructJSBlob):
* bindings/js/JSCryptoAlgorithmDictionary.cpp:
(WebCore::createRsaKeyGenParams):
* bindings/js/JSCryptoCustom.cpp:
(WebCore::JSCrypto::getRandomValues):
* bindings/js/JSCryptoOperationData.cpp:
(WebCore::cryptoOperationDataFromJSValue):
* bindings/js/JSDOMBinding.h:
(WebCore::toJS):
(WebCore::toPossiblySharedArrayBufferView):
(WebCore::toUnsharedArrayBufferView):
(WebCore::toPossiblySharedInt8Array):
(WebCore::toPossiblySharedInt16Array):
(WebCore::toPossiblySharedInt32Array):
(WebCore::toPossiblySharedUint8Array):
(WebCore::toPossiblySharedUint8ClampedArray):
(WebCore::toPossiblySharedUint16Array):
(WebCore::toPossiblySharedUint32Array):
(WebCore::toPossiblySharedFloat32Array):
(WebCore::toPossiblySharedFloat64Array):
(WebCore::toUnsharedInt8Array):
(WebCore::toUnsharedInt16Array):
(WebCore::toUnsharedInt32Array):
(WebCore::toUnsharedUint8Array):
(WebCore::toUnsharedUint8ClampedArray):
(WebCore::toUnsharedUint16Array):
(WebCore::toUnsharedUint32Array):
(WebCore::toUnsharedFloat32Array):
(WebCore::toUnsharedFloat64Array):
(WebCore::toArrayBufferView): Deleted.
(WebCore::toInt8Array): Deleted.
(WebCore::toInt16Array): Deleted.
(WebCore::toInt32Array): Deleted.
(WebCore::toUint8Array): Deleted.
(WebCore::toUint8ClampedArray): Deleted.
(WebCore::toUint16Array): Deleted.
(WebCore::toUint32Array): Deleted.
(WebCore::toFloat32Array): Deleted.
(WebCore::toFloat64Array): Deleted.
* bindings/js/JSDataCueCustom.cpp:
(WebCore::constructJSDataCue):
* bindings/js/JSDictionary.cpp:
(WebCore::JSDictionary::convertValue):
* bindings/js/JSFileCustom.cpp:
(WebCore::constructJSFile):
* bindings/js/JSMessagePortCustom.cpp:
(WebCore::extractTransferables):
* bindings/js/JSWebGLRenderingContextBaseCustom.cpp:
(WebCore::dataFunctionf):
(WebCore::dataFunctioni):
(WebCore::dataFunctionMatrix):
* bindings/js/JSXMLHttpRequestCustom.cpp:
(WebCore::JSXMLHttpRequest::send):
* bindings/js/SerializedScriptValue.cpp:
(WebCore::CloneSerializer::dumpArrayBufferView):
(WebCore::CloneSerializer::dumpIfTerminal):
(WebCore::CloneDeserializer::readArrayBufferView):
(WebCore::CloneDeserializer::readTerminal):
(WebCore::SerializedScriptValue::transferArrayBuffers):
* bindings/js/StructuredClone.cpp:
(WebCore::structuredCloneArrayBuffer):
(WebCore::structuredCloneArrayBufferView):
* bindings/scripts/CodeGeneratorJS.pm:
(JSValueToNative):
* css/FontFace.cpp:
(WebCore::FontFace::create):
* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::bufferData):
(WebCore::WebGL2RenderingContext::bufferSubData):
* platform/graphics/avfoundation/MediaPlayerPrivateAVFoundation.cpp:
(WebCore::MediaPlayerPrivateAVFoundation::extractKeyURIKeyIDAndCertificateFromInitData):
Source/WebKit/mac:
Support the RuntimeFlag.
* WebView/WebPreferencesPrivate.h:
Source/WebKit/win:
Support the RuntimeFlag.
* Interfaces/IWebPreferencesPrivate.idl:
Source/WebKit2:
Adds some small things we need for SharedArrayBuffer.
* UIProcess/API/C/WKPreferencesRefPrivate.h:
* UIProcess/API/Cocoa/WKPreferencesPrivate.h:
* WebProcess/InjectedBundle/InjectedBundle.cpp:
(WebKit::InjectedBundle::createWebDataFromUint8Array):
Source/WTF:
Adds some small things we need for SharedArrayBuffer.
* wtf/Atomics.h:
(WTF::Atomic::compareExchangeWeakRelaxed):
(WTF::Atomic::exchangeAdd):
(WTF::Atomic::exchangeAnd):
(WTF::Atomic::exchangeOr):
(WTF::Atomic::exchangeSub):
(WTF::Atomic::exchangeXor):
(WTF::atomicLoad):
(WTF::atomicStore):
(WTF::atomicCompareExchangeWeak):
(WTF::atomicCompareExchangeWeakRelaxed):
(WTF::atomicCompareExchangeStrong):
(WTF::atomicExchangeAdd):
(WTF::atomicExchangeAnd):
(WTF::atomicExchangeOr):
(WTF::atomicExchangeSub):
(WTF::atomicExchangeXor):
(WTF::atomicExchange):
(WTF::Atomic::exchangeAndAdd): Deleted.
(WTF::weakCompareAndSwap): Deleted.
We need to be able to do atomics operations on naked pointers. We also need to be able to do
all of the things that std::atomic does. This adds those things and renames
weakCompareAndSwap to atomicCompareExchangeWeakRelaxed so that we're using consistent
terminology.
* wtf/Bitmap.h:
(WTF::WordType>::concurrentTestAndSet): Renamed weakCompareAndSwap.
(WTF::WordType>::concurrentTestAndClear): Renamed weakCompareAndSwap.
* wtf/FastBitVector.h:
(WTF::FastBitVector::atomicSetAndCheck): Renamed weakCompareAndSwap.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
Added unparkCount(), which lets you unpark some bounded number of threads and returns the
number of threads unparked. This is just a modest extension of unparkAll(). unparkAll() now
just calls unparkCount(ptr, UINT_MAX).
Tools:
Use the right kind of typed array API.
* DumpRenderTree/TestRunner.cpp:
(setAudioResultCallback):
LayoutTests:
Adding tests. This is a work in progress.
* workers/sab: Added.
* workers/sab/simple-worker-1.js: Added.
(onmessage):
* workers/sab/simple-worker-2.js: Added.
(onmessage):
* workers/sab/simple.html: Added.
Canonical link: https://commits.webkit.org/181984@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208209 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-01 03:10:00 +00:00
if ( ! count )
return 0 ;
2015-08-11 19:51:35 +00:00
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : unparking count = " , count , " from " , RawPointer ( address ) , " . \n " ) ) ;
2015-08-11 19:51:35 +00:00
2016-09-13 15:59:25 +00:00
Vector < RefPtr < ThreadData > , 8 > threadDatas ;
2015-08-11 19:51:35 +00:00
dequeue (
address ,
JSC should support SharedArrayBuffer
https://bugs.webkit.org/show_bug.cgi?id=163986
Reviewed by Keith Miller.
JSTests:
This adds our own test for the various corner cases of SharedArrayBuffer. This test is meant to
check all of the things that don't require concurrency.
* stress/SharedArrayBuffer.js: Added.
(checkAtomics):
(shouldFail):
(Symbol):
(runAtomic):
Source/JavaScriptCore:
This implements https://tc39.github.io/ecmascript_sharedmem/shmem.html.
There is now a new SharedArrayBuffer type. In the JS runtime, which includes typed array
types, the SharedArrayBuffer is a drop-in replacement for ArrayBuffer, even though they are
distinct types (new SharedArrayBuffer() instanceof ArrayBuffer == false and vice versa). The
DOM will not recognize SharedArrayBuffer, or any typed array that wraps it, to ensure safety.
This matches what other browsers intend to do, see
https://github.com/tc39/ecmascript_sharedmem/issues/38. API is provided for the DOM to opt
into SharedArrayBuffer. One notable place is postMessage, which will share the
SharedArrayBuffer's underlying data storage with other workers. This creates a pool of shared
memory that the workers can use to talk to each other.
There is also an Atomics object in global scope, which exposes sequentially consistent atomic
operations: add, and, compareExchange, exchange, load, or, store, sub, and xor. Additionally
it exposes a Atomics.isLockFree utility, which takes a byte amount and returns true or false.
Also there is Atomics.wake/wait, which neatly map to ParkingLot.
Accesses to typed arrays that wrap SharedArrayBuffer are optimized by JSC the same way as
always. I believe that DFG and B3 already obey the following memory model, which I believe is
a bit weaker than Cambridge and a bit stronger than what is being proposed for
SharedArrayBuffer. To predict a program's behavior under the B3 memory model, imagine the
space of all possible programs that would result from running an optimizer that adversarially
follows B3's transformation rules. B3 transformations are correct if the newly created
program is equivalent to the old one, assuming that any opaque effect in IR (like the reads
and writes of a patchpoint/call/fence) could perform any load/store that satisfies the
B3::Effects summary. Opaque effects are a way of describing an infinite set of programs: any
program that only does the effects summarized in B3::Effects belongs to the set. For example,
this prevents motion of operations across fences since fences are summarized as opaque
effects that could read or write memory. This rule alone is not enough, because it leaves the
door open for turning an atomic operation (like a load) into a non-atomic one (like a load
followed by a store of the same value back to the same location or multiple loads). This is
not an optimization that either our compiler or the CPU would want to do. One way to think of
what exactly is forbidden is that B3 transformations that mess with memory accesses can only
reorder them or remove them. This means that for any execution of the untransformed program,
the corresponding execution of the transformed program (i.e. with the same input arguments
and the same programs filled in for the opaque effects) must have the same loads and stores,
with some removed and some reordered. This is a fairly simple mental model that B3 and DFG
already follow and it's based on existing abstractions for the infinite set of programs
inside an opaque effect (DFG's AbstractHeaps and B3's Effects).
This patch makes all atomics operations intrinsic, but the DFG doesn't know about any of them
yet. That's covered by bug 164108.
This ought to be perf-neutral, but I am still running tests to confirm this. I'm also still
writing new tests to cover all of the Atomics functionality and the behavior of SAB objects.
* API/JSTypedArray.cpp:
(JSObjectGetTypedArrayBytesPtr):
(JSObjectGetTypedArrayBuffer):
(JSObjectMakeArrayBufferWithBytesNoCopy):
* API/tests/CompareAndSwapTest.cpp:
(Bitmap::concurrentTestAndSet):
* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* dfg/DFGDesiredWatchpoints.cpp:
(JSC::DFG::ArrayBufferViewWatchpointAdaptor::add):
* heap/Heap.cpp:
(JSC::Heap::reportExtraMemoryVisited):
(JSC::Heap::reportExternalMemoryVisited):
* jsc.cpp:
(functionTransferArrayBuffer):
* runtime/ArrayBuffer.cpp:
(JSC::SharedArrayBufferContents::SharedArrayBufferContents):
(JSC::SharedArrayBufferContents::~SharedArrayBufferContents):
(JSC::ArrayBufferContents::ArrayBufferContents):
(JSC::ArrayBufferContents::operator=):
(JSC::ArrayBufferContents::~ArrayBufferContents):
(JSC::ArrayBufferContents::clear):
(JSC::ArrayBufferContents::destroy):
(JSC::ArrayBufferContents::reset):
(JSC::ArrayBufferContents::tryAllocate):
(JSC::ArrayBufferContents::makeShared):
(JSC::ArrayBufferContents::transferTo):
(JSC::ArrayBufferContents::copyTo):
(JSC::ArrayBufferContents::shareWith):
(JSC::ArrayBuffer::create):
(JSC::ArrayBuffer::createAdopted):
(JSC::ArrayBuffer::createFromBytes):
(JSC::ArrayBuffer::tryCreate):
(JSC::ArrayBuffer::createUninitialized):
(JSC::ArrayBuffer::tryCreateUninitialized):
(JSC::ArrayBuffer::createInternal):
(JSC::ArrayBuffer::ArrayBuffer):
(JSC::ArrayBuffer::slice):
(JSC::ArrayBuffer::sliceImpl):
(JSC::ArrayBuffer::makeShared):
(JSC::ArrayBuffer::setSharingMode):
(JSC::ArrayBuffer::transferTo):
(JSC::ArrayBuffer::transfer): Deleted.
* runtime/ArrayBuffer.h:
(JSC::arrayBufferSharingModeName):
(JSC::SharedArrayBufferContents::data):
(JSC::ArrayBufferContents::data):
(JSC::ArrayBufferContents::sizeInBytes):
(JSC::ArrayBufferContents::isShared):
(JSC::ArrayBuffer::sharingMode):
(JSC::ArrayBuffer::isShared):
(JSC::ArrayBuffer::gcSizeEstimateInBytes):
(JSC::arrayBufferDestructorNull): Deleted.
(JSC::arrayBufferDestructorDefault): Deleted.
(JSC::ArrayBufferContents::ArrayBufferContents): Deleted.
(JSC::ArrayBufferContents::transfer): Deleted.
(JSC::ArrayBufferContents::copyTo): Deleted.
(JSC::ArrayBuffer::create): Deleted.
(JSC::ArrayBuffer::createAdopted): Deleted.
(JSC::ArrayBuffer::createFromBytes): Deleted.
(JSC::ArrayBuffer::tryCreate): Deleted.
(JSC::ArrayBuffer::createUninitialized): Deleted.
(JSC::ArrayBuffer::tryCreateUninitialized): Deleted.
(JSC::ArrayBuffer::createInternal): Deleted.
(JSC::ArrayBuffer::ArrayBuffer): Deleted.
(JSC::ArrayBuffer::slice): Deleted.
(JSC::ArrayBuffer::sliceImpl): Deleted.
(JSC::ArrayBufferContents::tryAllocate): Deleted.
(JSC::ArrayBufferContents::~ArrayBufferContents): Deleted.
* runtime/ArrayBufferSharingMode.h: Added.
* runtime/ArrayBufferView.h:
(JSC::ArrayBufferView::possiblySharedBuffer):
(JSC::ArrayBufferView::unsharedBuffer):
(JSC::ArrayBufferView::isShared):
(JSC::ArrayBufferView::buffer): Deleted.
* runtime/AtomicsObject.cpp: Added.
(JSC::AtomicsObject::AtomicsObject):
(JSC::AtomicsObject::create):
(JSC::AtomicsObject::createStructure):
(JSC::AtomicsObject::finishCreation):
(JSC::atomicsFuncAdd):
(JSC::atomicsFuncAnd):
(JSC::atomicsFuncCompareExchange):
(JSC::atomicsFuncExchange):
(JSC::atomicsFuncIsLockFree):
(JSC::atomicsFuncLoad):
(JSC::atomicsFuncOr):
(JSC::atomicsFuncStore):
(JSC::atomicsFuncSub):
(JSC::atomicsFuncWait):
(JSC::atomicsFuncWake):
(JSC::atomicsFuncXor):
* runtime/AtomicsObject.h: Added.
* runtime/CommonIdentifiers.h:
* runtime/DataView.cpp:
(JSC::DataView::wrap):
* runtime/GenericTypedArrayViewInlines.h:
(JSC::GenericTypedArrayView<Adaptor>::subarray):
* runtime/Intrinsic.h:
* runtime/JSArrayBuffer.cpp:
(JSC::JSArrayBuffer::finishCreation):
(JSC::JSArrayBuffer::isShared):
(JSC::JSArrayBuffer::sharingMode):
* runtime/JSArrayBuffer.h:
(JSC::toPossiblySharedArrayBuffer):
(JSC::toUnsharedArrayBuffer):
(JSC::JSArrayBuffer::toWrapped):
(JSC::toArrayBuffer): Deleted.
* runtime/JSArrayBufferConstructor.cpp:
(JSC::JSArrayBufferConstructor::JSArrayBufferConstructor):
(JSC::JSArrayBufferConstructor::finishCreation):
(JSC::JSArrayBufferConstructor::create):
(JSC::constructArrayBuffer):
* runtime/JSArrayBufferConstructor.h:
(JSC::JSArrayBufferConstructor::sharingMode):
* runtime/JSArrayBufferPrototype.cpp:
(JSC::arrayBufferProtoFuncSlice):
(JSC::JSArrayBufferPrototype::JSArrayBufferPrototype):
(JSC::JSArrayBufferPrototype::finishCreation):
(JSC::JSArrayBufferPrototype::create):
* runtime/JSArrayBufferPrototype.h:
* runtime/JSArrayBufferView.cpp:
(JSC::JSArrayBufferView::finishCreation):
(JSC::JSArrayBufferView::visitChildren):
(JSC::JSArrayBufferView::unsharedBuffer):
(JSC::JSArrayBufferView::unsharedJSBuffer):
(JSC::JSArrayBufferView::possiblySharedJSBuffer):
(JSC::JSArrayBufferView::neuter):
(JSC::JSArrayBufferView::toWrapped): Deleted.
* runtime/JSArrayBufferView.h:
(JSC::JSArrayBufferView::jsBuffer): Deleted.
* runtime/JSArrayBufferViewInlines.h:
(JSC::JSArrayBufferView::isShared):
(JSC::JSArrayBufferView::possiblySharedBuffer):
(JSC::JSArrayBufferView::possiblySharedImpl):
(JSC::JSArrayBufferView::unsharedImpl):
(JSC::JSArrayBufferView::byteOffset):
(JSC::JSArrayBufferView::toWrapped):
(JSC::JSArrayBufferView::buffer): Deleted.
(JSC::JSArrayBufferView::impl): Deleted.
(JSC::JSArrayBufferView::neuter): Deleted.
* runtime/JSDataView.cpp:
(JSC::JSDataView::possiblySharedTypedImpl):
(JSC::JSDataView::unsharedTypedImpl):
(JSC::JSDataView::getTypedArrayImpl):
(JSC::JSDataView::typedImpl): Deleted.
* runtime/JSDataView.h:
(JSC::JSDataView::possiblySharedBuffer):
(JSC::JSDataView::unsharedBuffer):
(JSC::JSDataView::buffer): Deleted.
* runtime/JSDataViewPrototype.cpp:
(JSC::dataViewProtoGetterBuffer):
* runtime/JSGenericTypedArrayView.h:
(JSC::toPossiblySharedNativeTypedView):
(JSC::toUnsharedNativeTypedView):
(JSC::JSGenericTypedArrayView<Adaptor>::toWrapped):
(JSC::JSGenericTypedArrayView::typedImpl): Deleted.
(JSC::toNativeTypedView): Deleted.
* runtime/JSGenericTypedArrayViewInlines.h:
(JSC::JSGenericTypedArrayView<Adaptor>::create):
(JSC::JSGenericTypedArrayView<Adaptor>::possiblySharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::unsharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::getTypedArrayImpl):
* runtime/JSGenericTypedArrayViewPrototypeFunctions.h:
(JSC::genericTypedArrayViewProtoGetterFuncBuffer):
(JSC::genericTypedArrayViewPrivateFuncSubarrayCreate):
* runtime/JSGlobalObject.cpp:
(JSC::createAtomicsProperty):
(JSC::JSGlobalObject::init):
(JSC::JSGlobalObject::visitChildren):
* runtime/JSGlobalObject.h:
(JSC::JSGlobalObject::arrayBufferPrototype):
(JSC::JSGlobalObject::arrayBufferStructure):
* runtime/MathObject.cpp:
* runtime/RuntimeFlags.h:
* runtime/SimpleTypedArrayController.cpp:
(JSC::SimpleTypedArrayController::toJS):
* runtime/TypedArrayType.h:
(JSC::typedArrayTypeForType):
Source/WebCore:
New tests added in the LayoutTests/workers/sab directory.
This teaches WebCore that a typed array could be shared or not. By default, WebCore will
reject shared typed arrays as if they were not typed arrays. This ensures that we don't get
race conditions in code that can't handle it.
If you postMessage a SharedArrayBuffer or something that wraps it, you will send the shared
memory to the other worker.
* Modules/encryptedmedia/CDMSessionClearKey.cpp:
(WebCore::CDMSessionClearKey::cachedKeyForKeyID):
* Modules/fetch/FetchBody.cpp:
(WebCore::FetchBody::extract):
* Modules/mediastream/RTCDataChannel.cpp:
(WebCore::RTCDataChannel::send):
* Modules/webaudio/AudioBuffer.cpp:
(WebCore::AudioBuffer::getChannelData):
* Modules/websockets/WebSocket.cpp:
(WebCore::WebSocket::send):
* bindings/js/JSBlobCustom.cpp:
(WebCore::constructJSBlob):
* bindings/js/JSCryptoAlgorithmDictionary.cpp:
(WebCore::createRsaKeyGenParams):
* bindings/js/JSCryptoCustom.cpp:
(WebCore::JSCrypto::getRandomValues):
* bindings/js/JSCryptoOperationData.cpp:
(WebCore::cryptoOperationDataFromJSValue):
* bindings/js/JSDOMBinding.h:
(WebCore::toJS):
(WebCore::toPossiblySharedArrayBufferView):
(WebCore::toUnsharedArrayBufferView):
(WebCore::toPossiblySharedInt8Array):
(WebCore::toPossiblySharedInt16Array):
(WebCore::toPossiblySharedInt32Array):
(WebCore::toPossiblySharedUint8Array):
(WebCore::toPossiblySharedUint8ClampedArray):
(WebCore::toPossiblySharedUint16Array):
(WebCore::toPossiblySharedUint32Array):
(WebCore::toPossiblySharedFloat32Array):
(WebCore::toPossiblySharedFloat64Array):
(WebCore::toUnsharedInt8Array):
(WebCore::toUnsharedInt16Array):
(WebCore::toUnsharedInt32Array):
(WebCore::toUnsharedUint8Array):
(WebCore::toUnsharedUint8ClampedArray):
(WebCore::toUnsharedUint16Array):
(WebCore::toUnsharedUint32Array):
(WebCore::toUnsharedFloat32Array):
(WebCore::toUnsharedFloat64Array):
(WebCore::toArrayBufferView): Deleted.
(WebCore::toInt8Array): Deleted.
(WebCore::toInt16Array): Deleted.
(WebCore::toInt32Array): Deleted.
(WebCore::toUint8Array): Deleted.
(WebCore::toUint8ClampedArray): Deleted.
(WebCore::toUint16Array): Deleted.
(WebCore::toUint32Array): Deleted.
(WebCore::toFloat32Array): Deleted.
(WebCore::toFloat64Array): Deleted.
* bindings/js/JSDataCueCustom.cpp:
(WebCore::constructJSDataCue):
* bindings/js/JSDictionary.cpp:
(WebCore::JSDictionary::convertValue):
* bindings/js/JSFileCustom.cpp:
(WebCore::constructJSFile):
* bindings/js/JSMessagePortCustom.cpp:
(WebCore::extractTransferables):
* bindings/js/JSWebGLRenderingContextBaseCustom.cpp:
(WebCore::dataFunctionf):
(WebCore::dataFunctioni):
(WebCore::dataFunctionMatrix):
* bindings/js/JSXMLHttpRequestCustom.cpp:
(WebCore::JSXMLHttpRequest::send):
* bindings/js/SerializedScriptValue.cpp:
(WebCore::CloneSerializer::dumpArrayBufferView):
(WebCore::CloneSerializer::dumpIfTerminal):
(WebCore::CloneDeserializer::readArrayBufferView):
(WebCore::CloneDeserializer::readTerminal):
(WebCore::SerializedScriptValue::transferArrayBuffers):
* bindings/js/StructuredClone.cpp:
(WebCore::structuredCloneArrayBuffer):
(WebCore::structuredCloneArrayBufferView):
* bindings/scripts/CodeGeneratorJS.pm:
(JSValueToNative):
* css/FontFace.cpp:
(WebCore::FontFace::create):
* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::bufferData):
(WebCore::WebGL2RenderingContext::bufferSubData):
* platform/graphics/avfoundation/MediaPlayerPrivateAVFoundation.cpp:
(WebCore::MediaPlayerPrivateAVFoundation::extractKeyURIKeyIDAndCertificateFromInitData):
Source/WebKit/mac:
Support the RuntimeFlag.
* WebView/WebPreferencesPrivate.h:
Source/WebKit/win:
Support the RuntimeFlag.
* Interfaces/IWebPreferencesPrivate.idl:
Source/WebKit2:
Adds some small things we need for SharedArrayBuffer.
* UIProcess/API/C/WKPreferencesRefPrivate.h:
* UIProcess/API/Cocoa/WKPreferencesPrivate.h:
* WebProcess/InjectedBundle/InjectedBundle.cpp:
(WebKit::InjectedBundle::createWebDataFromUint8Array):
Source/WTF:
Adds some small things we need for SharedArrayBuffer.
* wtf/Atomics.h:
(WTF::Atomic::compareExchangeWeakRelaxed):
(WTF::Atomic::exchangeAdd):
(WTF::Atomic::exchangeAnd):
(WTF::Atomic::exchangeOr):
(WTF::Atomic::exchangeSub):
(WTF::Atomic::exchangeXor):
(WTF::atomicLoad):
(WTF::atomicStore):
(WTF::atomicCompareExchangeWeak):
(WTF::atomicCompareExchangeWeakRelaxed):
(WTF::atomicCompareExchangeStrong):
(WTF::atomicExchangeAdd):
(WTF::atomicExchangeAnd):
(WTF::atomicExchangeOr):
(WTF::atomicExchangeSub):
(WTF::atomicExchangeXor):
(WTF::atomicExchange):
(WTF::Atomic::exchangeAndAdd): Deleted.
(WTF::weakCompareAndSwap): Deleted.
We need to be able to do atomics operations on naked pointers. We also need to be able to do
all of the things that std::atomic does. This adds those things and renames
weakCompareAndSwap to atomicCompareExchangeWeakRelaxed so that we're using consistent
terminology.
* wtf/Bitmap.h:
(WTF::WordType>::concurrentTestAndSet): Renamed weakCompareAndSwap.
(WTF::WordType>::concurrentTestAndClear): Renamed weakCompareAndSwap.
* wtf/FastBitVector.h:
(WTF::FastBitVector::atomicSetAndCheck): Renamed weakCompareAndSwap.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
Added unparkCount(), which lets you unpark some bounded number of threads and returns the
number of threads unparked. This is just a modest extension of unparkAll(). unparkAll() now
just calls unparkCount(ptr, UINT_MAX).
Tools:
Use the right kind of typed array API.
* DumpRenderTree/TestRunner.cpp:
(setAudioResultCallback):
LayoutTests:
Adding tests. This is a work in progress.
* workers/sab: Added.
* workers/sab/simple-worker-1.js: Added.
(onmessage):
* workers/sab/simple-worker-2.js: Added.
(onmessage):
* workers/sab/simple.html: Added.
Canonical link: https://commits.webkit.org/181984@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208209 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-01 03:10:00 +00:00
// FIXME: It seems like this ought to be EnsureNonEmpty if we follow what unparkOne() does,
// but that seems wrong.
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
BucketMode : : IgnoreEmpty ,
WTF::Lock should be fair eventually
https://bugs.webkit.org/show_bug.cgi?id=159384
Reviewed by Geoffrey Garen.
Source/WTF:
In https://webkit.org/blog/6161/locking-in-webkit/ we showed how relaxing the fairness of
locks makes them fast. That post presented lock fairness as a trade-off between two
extremes:
- Barging. A barging lock, like WTF::Lock, releases the lock in unlock() even if there was a
thread on the queue. If there was a thread on the queue, the lock is released and that
thread is made runnable. That thread may then grab the lock, or some other thread may grab
the lock first (it may barge). Usually, the barging thread is the thread that released the
lock in the first place. This maximizes throughput but hurts fairness. There is no good
theoretical bound on how unfair the lock may become, but empirical data suggests that it's
fair enough for the cases we previously measured.
- FIFO. A FIFO lock, like HandoffLock in ToyLocks.h, does not release the lock in unlock()
if there is a thread waiting. If there is a thread waiting, unlock() will make that thread
runnable and inform it that it now holds the lock. This ensures perfect round-robin
fairness and allows us to reason theoretically about how long it may take for a thread to
grab the lock. For example, if we know that only N threads are running and each one may
contend on a critical section, and each one may hold the lock for at most S seconds, then
the time it takes to grab the lock is N * S. Unfortunately, FIFO locks perform very badly
in most cases. This is because for the common case of short critical sections, they force
a context switch after each critical section if the lock is contended.
This change makes WTF::Lock almost as fair as FIFO while still being as fast as barging.
Thanks to this new algorithm, you can now have both of these things at the same time.
This change makes WTF::Lock eventually fair. We can almost (more on the caveats below)
guarantee that the time it takes to grab a lock is N * max(1ms, S). In other words, critical
sections that are longer than 1ms are always fair. For shorter critical sections, the amount
of time that any thread waits is 1ms times the number of threads. There are some caveats
that arise from our use of randomness, but even then, in the limit as the critical section
length goes to infinity, the lock becomes fair. The corner cases are unlikely to happen; our
experiments show that the lock becomes exactly as fair as a FIFO lock for any critical
section that is 1ms or longer.
The fairness mechanism is broken into two parts. WTF::Lock can now choose to unlock a lock
fairly or unfairly thanks to the new ParkingLot token mechanism. WTF::Lock knows when to use
fair unlocking based on a timeout mechanism in ParkingLot called timeToBeFair.
ParkingLot::unparkOne() and ParkingLot::parkConditionally() can now communicate with each
other via a token. unparkOne() can pass a token, which parkConditionally() will return. This
change also makes parkConditionally() a lot more precise about when it was unparked due to a
call to unparkOne(). If unparkOne() is told that a thread was unparked then this thread is
guaranteed to report that it was unparked rather than timing out, and that thread is
guaranteed to get the token that unparkOne() passed. The token is an intptr_t. We use it as
a boolean variable in WTF::Lock, but you could use it to pass arbitrary data structures. By
default, the token is zero. WTF::Lock's unlock() will pass 1 as the token if it is doing
fair unlocking. In that case, unlock() will not release the lock, and lock() will know that
it holds the lock as soon as parkConditionally() returns. Note that this algorithm relies
on unparkOne() invoking WTF::Lock's callback while the queue lock is held, so that WTF::Lock
can make a decision about unlock strategy and inject a token while it has complete knowledge
over the state of the queue. As such, it's not immediately obvious how to implement this
algorithm on top of futexes. You really need ParkingLot!
WTF::Lock does not use fair unlocking every time. We expose a new API, Lock::unlockFairly(),
which forces the fair unlocking behavior. Additionally, ParkingLot now maintains a
per-bucket stochastic fairness timeout. When the timeout fires, the unparkOne() callback
sees UnparkResult::timeToBeFair = true. This timeout is set to be anywhere from 0ms to 1ms
at random. When a dequeue happens and there are threads that actually get dequeued, we check
if the time since the last unfair unlock (the last time timeToBeFair was set to true) is
more than the timeout amount. If so, then we set timeToBeFair to true and reset the timeout.
This means that in the absence of ParkingLot collisions, unfair unlocking is guaranteed to
happen at least once per millisecond. It will happen at 2 KHz on average. If there are
collisions, then each collision adds one millisecond to the worst case (and 0.5 ms to the
average case). The reason why we don't just use a fixed 1ms timeout is that we want to avoid
resonance. Imagine a program in which some thread acquires a lock at 1 KHz in-phase with the
timeToBeFair timeout. Then this thread would be the benefactor of fairness to the detriment
of everyone else. Randomness ensures that we aren't too fair to any one thread.
Empirically, this is neutral on our major benchmarks like JetStream but it's an enormous
improvement in LockFairnessTest. It's common for an unfair lock (either our BargingLock, the
old WTF::Lock, any of the other futex-based locks that barge, or new os_unfair_lock) to
allow only one thread to hold the lock during a whole second in which each thread is holding
the lock for 1ms at a time. This is because in a barging lock, releasing a lock after
holding it for 1ms and then reacquiring it immediately virtually ensures that none of the
other threads can wake up in time to grab it before it's relocked. But the new WTF::Lock
handles this case like a champ: each thread gets equal turns.
Here's some data. If we launch 10 threads and have each of them run for 1 second while
repeatedly holding a critical section for 1ms, then here's how many times each thread gets
to hold the lock using the old WTF::Lock algorithm:
799, 6, 1, 1, 1, 1, 1, 1, 1, 1
One thread hogged the lock for almost the whole time! With the new WTF::Lock, the lock
becomes totally fair:
80, 79, 79, 79, 79, 79, 79, 80, 80, 79
I don't know of anyone creating such an automatically-fair adaptive lock before, so I think
that this is a pretty awesome advancement to the state of the art!
This change is good for three reasons:
- We do have long critical sections in WebKit and we don't want to have to worry about
starvation. This reduces the likelihood that we will see starvation due to our lock
strategy.
- I was talking to ggaren about bmalloc's locking needs, and he wanted unlockFairly() or
lockFairly() or some moral equivalent for the scavenger thread.
- If we use a WTF::Lock to manage heap access in a multithreaded GC, we'll need the ability
to unlock and relock without barging.
* benchmarks/LockFairnessTest.cpp:
(main):
* benchmarks/ToyLocks.h:
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::notifyOne):
* wtf/Lock.cpp:
(WTF::LockBase::lockSlow):
(WTF::LockBase::unlockSlow):
(WTF::LockBase::unlockFairlySlow):
(WTF::LockBase::unlockSlowImpl):
* wtf/Lock.h:
(WTF::LockBase::try_lock):
(WTF::LockBase::unlock):
(WTF::LockBase::unlockFairly):
(WTF::LockBase::isHeld):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
(WTF::ParkingLot::unparkOne):
Tools:
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/178039@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@203350 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-07-18 18:32:52 +00:00
[ & ] ( ThreadData * element , bool ) {
2015-08-11 19:51:35 +00:00
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : Observing element with address = " , RawPointer ( element - > address ) , " \n " ) ) ;
2015-08-11 19:51:35 +00:00
if ( element - > address ! = address )
return DequeueResult : : Ignore ;
threadDatas . append ( element ) ;
JSC should support SharedArrayBuffer
https://bugs.webkit.org/show_bug.cgi?id=163986
Reviewed by Keith Miller.
JSTests:
This adds our own test for the various corner cases of SharedArrayBuffer. This test is meant to
check all of the things that don't require concurrency.
* stress/SharedArrayBuffer.js: Added.
(checkAtomics):
(shouldFail):
(Symbol):
(runAtomic):
Source/JavaScriptCore:
This implements https://tc39.github.io/ecmascript_sharedmem/shmem.html.
There is now a new SharedArrayBuffer type. In the JS runtime, which includes typed array
types, the SharedArrayBuffer is a drop-in replacement for ArrayBuffer, even though they are
distinct types (new SharedArrayBuffer() instanceof ArrayBuffer == false and vice versa). The
DOM will not recognize SharedArrayBuffer, or any typed array that wraps it, to ensure safety.
This matches what other browsers intend to do, see
https://github.com/tc39/ecmascript_sharedmem/issues/38. API is provided for the DOM to opt
into SharedArrayBuffer. One notable place is postMessage, which will share the
SharedArrayBuffer's underlying data storage with other workers. This creates a pool of shared
memory that the workers can use to talk to each other.
There is also an Atomics object in global scope, which exposes sequentially consistent atomic
operations: add, and, compareExchange, exchange, load, or, store, sub, and xor. Additionally
it exposes a Atomics.isLockFree utility, which takes a byte amount and returns true or false.
Also there is Atomics.wake/wait, which neatly map to ParkingLot.
Accesses to typed arrays that wrap SharedArrayBuffer are optimized by JSC the same way as
always. I believe that DFG and B3 already obey the following memory model, which I believe is
a bit weaker than Cambridge and a bit stronger than what is being proposed for
SharedArrayBuffer. To predict a program's behavior under the B3 memory model, imagine the
space of all possible programs that would result from running an optimizer that adversarially
follows B3's transformation rules. B3 transformations are correct if the newly created
program is equivalent to the old one, assuming that any opaque effect in IR (like the reads
and writes of a patchpoint/call/fence) could perform any load/store that satisfies the
B3::Effects summary. Opaque effects are a way of describing an infinite set of programs: any
program that only does the effects summarized in B3::Effects belongs to the set. For example,
this prevents motion of operations across fences since fences are summarized as opaque
effects that could read or write memory. This rule alone is not enough, because it leaves the
door open for turning an atomic operation (like a load) into a non-atomic one (like a load
followed by a store of the same value back to the same location or multiple loads). This is
not an optimization that either our compiler or the CPU would want to do. One way to think of
what exactly is forbidden is that B3 transformations that mess with memory accesses can only
reorder them or remove them. This means that for any execution of the untransformed program,
the corresponding execution of the transformed program (i.e. with the same input arguments
and the same programs filled in for the opaque effects) must have the same loads and stores,
with some removed and some reordered. This is a fairly simple mental model that B3 and DFG
already follow and it's based on existing abstractions for the infinite set of programs
inside an opaque effect (DFG's AbstractHeaps and B3's Effects).
This patch makes all atomics operations intrinsic, but the DFG doesn't know about any of them
yet. That's covered by bug 164108.
This ought to be perf-neutral, but I am still running tests to confirm this. I'm also still
writing new tests to cover all of the Atomics functionality and the behavior of SAB objects.
* API/JSTypedArray.cpp:
(JSObjectGetTypedArrayBytesPtr):
(JSObjectGetTypedArrayBuffer):
(JSObjectMakeArrayBufferWithBytesNoCopy):
* API/tests/CompareAndSwapTest.cpp:
(Bitmap::concurrentTestAndSet):
* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* dfg/DFGDesiredWatchpoints.cpp:
(JSC::DFG::ArrayBufferViewWatchpointAdaptor::add):
* heap/Heap.cpp:
(JSC::Heap::reportExtraMemoryVisited):
(JSC::Heap::reportExternalMemoryVisited):
* jsc.cpp:
(functionTransferArrayBuffer):
* runtime/ArrayBuffer.cpp:
(JSC::SharedArrayBufferContents::SharedArrayBufferContents):
(JSC::SharedArrayBufferContents::~SharedArrayBufferContents):
(JSC::ArrayBufferContents::ArrayBufferContents):
(JSC::ArrayBufferContents::operator=):
(JSC::ArrayBufferContents::~ArrayBufferContents):
(JSC::ArrayBufferContents::clear):
(JSC::ArrayBufferContents::destroy):
(JSC::ArrayBufferContents::reset):
(JSC::ArrayBufferContents::tryAllocate):
(JSC::ArrayBufferContents::makeShared):
(JSC::ArrayBufferContents::transferTo):
(JSC::ArrayBufferContents::copyTo):
(JSC::ArrayBufferContents::shareWith):
(JSC::ArrayBuffer::create):
(JSC::ArrayBuffer::createAdopted):
(JSC::ArrayBuffer::createFromBytes):
(JSC::ArrayBuffer::tryCreate):
(JSC::ArrayBuffer::createUninitialized):
(JSC::ArrayBuffer::tryCreateUninitialized):
(JSC::ArrayBuffer::createInternal):
(JSC::ArrayBuffer::ArrayBuffer):
(JSC::ArrayBuffer::slice):
(JSC::ArrayBuffer::sliceImpl):
(JSC::ArrayBuffer::makeShared):
(JSC::ArrayBuffer::setSharingMode):
(JSC::ArrayBuffer::transferTo):
(JSC::ArrayBuffer::transfer): Deleted.
* runtime/ArrayBuffer.h:
(JSC::arrayBufferSharingModeName):
(JSC::SharedArrayBufferContents::data):
(JSC::ArrayBufferContents::data):
(JSC::ArrayBufferContents::sizeInBytes):
(JSC::ArrayBufferContents::isShared):
(JSC::ArrayBuffer::sharingMode):
(JSC::ArrayBuffer::isShared):
(JSC::ArrayBuffer::gcSizeEstimateInBytes):
(JSC::arrayBufferDestructorNull): Deleted.
(JSC::arrayBufferDestructorDefault): Deleted.
(JSC::ArrayBufferContents::ArrayBufferContents): Deleted.
(JSC::ArrayBufferContents::transfer): Deleted.
(JSC::ArrayBufferContents::copyTo): Deleted.
(JSC::ArrayBuffer::create): Deleted.
(JSC::ArrayBuffer::createAdopted): Deleted.
(JSC::ArrayBuffer::createFromBytes): Deleted.
(JSC::ArrayBuffer::tryCreate): Deleted.
(JSC::ArrayBuffer::createUninitialized): Deleted.
(JSC::ArrayBuffer::tryCreateUninitialized): Deleted.
(JSC::ArrayBuffer::createInternal): Deleted.
(JSC::ArrayBuffer::ArrayBuffer): Deleted.
(JSC::ArrayBuffer::slice): Deleted.
(JSC::ArrayBuffer::sliceImpl): Deleted.
(JSC::ArrayBufferContents::tryAllocate): Deleted.
(JSC::ArrayBufferContents::~ArrayBufferContents): Deleted.
* runtime/ArrayBufferSharingMode.h: Added.
* runtime/ArrayBufferView.h:
(JSC::ArrayBufferView::possiblySharedBuffer):
(JSC::ArrayBufferView::unsharedBuffer):
(JSC::ArrayBufferView::isShared):
(JSC::ArrayBufferView::buffer): Deleted.
* runtime/AtomicsObject.cpp: Added.
(JSC::AtomicsObject::AtomicsObject):
(JSC::AtomicsObject::create):
(JSC::AtomicsObject::createStructure):
(JSC::AtomicsObject::finishCreation):
(JSC::atomicsFuncAdd):
(JSC::atomicsFuncAnd):
(JSC::atomicsFuncCompareExchange):
(JSC::atomicsFuncExchange):
(JSC::atomicsFuncIsLockFree):
(JSC::atomicsFuncLoad):
(JSC::atomicsFuncOr):
(JSC::atomicsFuncStore):
(JSC::atomicsFuncSub):
(JSC::atomicsFuncWait):
(JSC::atomicsFuncWake):
(JSC::atomicsFuncXor):
* runtime/AtomicsObject.h: Added.
* runtime/CommonIdentifiers.h:
* runtime/DataView.cpp:
(JSC::DataView::wrap):
* runtime/GenericTypedArrayViewInlines.h:
(JSC::GenericTypedArrayView<Adaptor>::subarray):
* runtime/Intrinsic.h:
* runtime/JSArrayBuffer.cpp:
(JSC::JSArrayBuffer::finishCreation):
(JSC::JSArrayBuffer::isShared):
(JSC::JSArrayBuffer::sharingMode):
* runtime/JSArrayBuffer.h:
(JSC::toPossiblySharedArrayBuffer):
(JSC::toUnsharedArrayBuffer):
(JSC::JSArrayBuffer::toWrapped):
(JSC::toArrayBuffer): Deleted.
* runtime/JSArrayBufferConstructor.cpp:
(JSC::JSArrayBufferConstructor::JSArrayBufferConstructor):
(JSC::JSArrayBufferConstructor::finishCreation):
(JSC::JSArrayBufferConstructor::create):
(JSC::constructArrayBuffer):
* runtime/JSArrayBufferConstructor.h:
(JSC::JSArrayBufferConstructor::sharingMode):
* runtime/JSArrayBufferPrototype.cpp:
(JSC::arrayBufferProtoFuncSlice):
(JSC::JSArrayBufferPrototype::JSArrayBufferPrototype):
(JSC::JSArrayBufferPrototype::finishCreation):
(JSC::JSArrayBufferPrototype::create):
* runtime/JSArrayBufferPrototype.h:
* runtime/JSArrayBufferView.cpp:
(JSC::JSArrayBufferView::finishCreation):
(JSC::JSArrayBufferView::visitChildren):
(JSC::JSArrayBufferView::unsharedBuffer):
(JSC::JSArrayBufferView::unsharedJSBuffer):
(JSC::JSArrayBufferView::possiblySharedJSBuffer):
(JSC::JSArrayBufferView::neuter):
(JSC::JSArrayBufferView::toWrapped): Deleted.
* runtime/JSArrayBufferView.h:
(JSC::JSArrayBufferView::jsBuffer): Deleted.
* runtime/JSArrayBufferViewInlines.h:
(JSC::JSArrayBufferView::isShared):
(JSC::JSArrayBufferView::possiblySharedBuffer):
(JSC::JSArrayBufferView::possiblySharedImpl):
(JSC::JSArrayBufferView::unsharedImpl):
(JSC::JSArrayBufferView::byteOffset):
(JSC::JSArrayBufferView::toWrapped):
(JSC::JSArrayBufferView::buffer): Deleted.
(JSC::JSArrayBufferView::impl): Deleted.
(JSC::JSArrayBufferView::neuter): Deleted.
* runtime/JSDataView.cpp:
(JSC::JSDataView::possiblySharedTypedImpl):
(JSC::JSDataView::unsharedTypedImpl):
(JSC::JSDataView::getTypedArrayImpl):
(JSC::JSDataView::typedImpl): Deleted.
* runtime/JSDataView.h:
(JSC::JSDataView::possiblySharedBuffer):
(JSC::JSDataView::unsharedBuffer):
(JSC::JSDataView::buffer): Deleted.
* runtime/JSDataViewPrototype.cpp:
(JSC::dataViewProtoGetterBuffer):
* runtime/JSGenericTypedArrayView.h:
(JSC::toPossiblySharedNativeTypedView):
(JSC::toUnsharedNativeTypedView):
(JSC::JSGenericTypedArrayView<Adaptor>::toWrapped):
(JSC::JSGenericTypedArrayView::typedImpl): Deleted.
(JSC::toNativeTypedView): Deleted.
* runtime/JSGenericTypedArrayViewInlines.h:
(JSC::JSGenericTypedArrayView<Adaptor>::create):
(JSC::JSGenericTypedArrayView<Adaptor>::possiblySharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::unsharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::getTypedArrayImpl):
* runtime/JSGenericTypedArrayViewPrototypeFunctions.h:
(JSC::genericTypedArrayViewProtoGetterFuncBuffer):
(JSC::genericTypedArrayViewPrivateFuncSubarrayCreate):
* runtime/JSGlobalObject.cpp:
(JSC::createAtomicsProperty):
(JSC::JSGlobalObject::init):
(JSC::JSGlobalObject::visitChildren):
* runtime/JSGlobalObject.h:
(JSC::JSGlobalObject::arrayBufferPrototype):
(JSC::JSGlobalObject::arrayBufferStructure):
* runtime/MathObject.cpp:
* runtime/RuntimeFlags.h:
* runtime/SimpleTypedArrayController.cpp:
(JSC::SimpleTypedArrayController::toJS):
* runtime/TypedArrayType.h:
(JSC::typedArrayTypeForType):
Source/WebCore:
New tests added in the LayoutTests/workers/sab directory.
This teaches WebCore that a typed array could be shared or not. By default, WebCore will
reject shared typed arrays as if they were not typed arrays. This ensures that we don't get
race conditions in code that can't handle it.
If you postMessage a SharedArrayBuffer or something that wraps it, you will send the shared
memory to the other worker.
* Modules/encryptedmedia/CDMSessionClearKey.cpp:
(WebCore::CDMSessionClearKey::cachedKeyForKeyID):
* Modules/fetch/FetchBody.cpp:
(WebCore::FetchBody::extract):
* Modules/mediastream/RTCDataChannel.cpp:
(WebCore::RTCDataChannel::send):
* Modules/webaudio/AudioBuffer.cpp:
(WebCore::AudioBuffer::getChannelData):
* Modules/websockets/WebSocket.cpp:
(WebCore::WebSocket::send):
* bindings/js/JSBlobCustom.cpp:
(WebCore::constructJSBlob):
* bindings/js/JSCryptoAlgorithmDictionary.cpp:
(WebCore::createRsaKeyGenParams):
* bindings/js/JSCryptoCustom.cpp:
(WebCore::JSCrypto::getRandomValues):
* bindings/js/JSCryptoOperationData.cpp:
(WebCore::cryptoOperationDataFromJSValue):
* bindings/js/JSDOMBinding.h:
(WebCore::toJS):
(WebCore::toPossiblySharedArrayBufferView):
(WebCore::toUnsharedArrayBufferView):
(WebCore::toPossiblySharedInt8Array):
(WebCore::toPossiblySharedInt16Array):
(WebCore::toPossiblySharedInt32Array):
(WebCore::toPossiblySharedUint8Array):
(WebCore::toPossiblySharedUint8ClampedArray):
(WebCore::toPossiblySharedUint16Array):
(WebCore::toPossiblySharedUint32Array):
(WebCore::toPossiblySharedFloat32Array):
(WebCore::toPossiblySharedFloat64Array):
(WebCore::toUnsharedInt8Array):
(WebCore::toUnsharedInt16Array):
(WebCore::toUnsharedInt32Array):
(WebCore::toUnsharedUint8Array):
(WebCore::toUnsharedUint8ClampedArray):
(WebCore::toUnsharedUint16Array):
(WebCore::toUnsharedUint32Array):
(WebCore::toUnsharedFloat32Array):
(WebCore::toUnsharedFloat64Array):
(WebCore::toArrayBufferView): Deleted.
(WebCore::toInt8Array): Deleted.
(WebCore::toInt16Array): Deleted.
(WebCore::toInt32Array): Deleted.
(WebCore::toUint8Array): Deleted.
(WebCore::toUint8ClampedArray): Deleted.
(WebCore::toUint16Array): Deleted.
(WebCore::toUint32Array): Deleted.
(WebCore::toFloat32Array): Deleted.
(WebCore::toFloat64Array): Deleted.
* bindings/js/JSDataCueCustom.cpp:
(WebCore::constructJSDataCue):
* bindings/js/JSDictionary.cpp:
(WebCore::JSDictionary::convertValue):
* bindings/js/JSFileCustom.cpp:
(WebCore::constructJSFile):
* bindings/js/JSMessagePortCustom.cpp:
(WebCore::extractTransferables):
* bindings/js/JSWebGLRenderingContextBaseCustom.cpp:
(WebCore::dataFunctionf):
(WebCore::dataFunctioni):
(WebCore::dataFunctionMatrix):
* bindings/js/JSXMLHttpRequestCustom.cpp:
(WebCore::JSXMLHttpRequest::send):
* bindings/js/SerializedScriptValue.cpp:
(WebCore::CloneSerializer::dumpArrayBufferView):
(WebCore::CloneSerializer::dumpIfTerminal):
(WebCore::CloneDeserializer::readArrayBufferView):
(WebCore::CloneDeserializer::readTerminal):
(WebCore::SerializedScriptValue::transferArrayBuffers):
* bindings/js/StructuredClone.cpp:
(WebCore::structuredCloneArrayBuffer):
(WebCore::structuredCloneArrayBufferView):
* bindings/scripts/CodeGeneratorJS.pm:
(JSValueToNative):
* css/FontFace.cpp:
(WebCore::FontFace::create):
* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::bufferData):
(WebCore::WebGL2RenderingContext::bufferSubData):
* platform/graphics/avfoundation/MediaPlayerPrivateAVFoundation.cpp:
(WebCore::MediaPlayerPrivateAVFoundation::extractKeyURIKeyIDAndCertificateFromInitData):
Source/WebKit/mac:
Support the RuntimeFlag.
* WebView/WebPreferencesPrivate.h:
Source/WebKit/win:
Support the RuntimeFlag.
* Interfaces/IWebPreferencesPrivate.idl:
Source/WebKit2:
Adds some small things we need for SharedArrayBuffer.
* UIProcess/API/C/WKPreferencesRefPrivate.h:
* UIProcess/API/Cocoa/WKPreferencesPrivate.h:
* WebProcess/InjectedBundle/InjectedBundle.cpp:
(WebKit::InjectedBundle::createWebDataFromUint8Array):
Source/WTF:
Adds some small things we need for SharedArrayBuffer.
* wtf/Atomics.h:
(WTF::Atomic::compareExchangeWeakRelaxed):
(WTF::Atomic::exchangeAdd):
(WTF::Atomic::exchangeAnd):
(WTF::Atomic::exchangeOr):
(WTF::Atomic::exchangeSub):
(WTF::Atomic::exchangeXor):
(WTF::atomicLoad):
(WTF::atomicStore):
(WTF::atomicCompareExchangeWeak):
(WTF::atomicCompareExchangeWeakRelaxed):
(WTF::atomicCompareExchangeStrong):
(WTF::atomicExchangeAdd):
(WTF::atomicExchangeAnd):
(WTF::atomicExchangeOr):
(WTF::atomicExchangeSub):
(WTF::atomicExchangeXor):
(WTF::atomicExchange):
(WTF::Atomic::exchangeAndAdd): Deleted.
(WTF::weakCompareAndSwap): Deleted.
We need to be able to do atomics operations on naked pointers. We also need to be able to do
all of the things that std::atomic does. This adds those things and renames
weakCompareAndSwap to atomicCompareExchangeWeakRelaxed so that we're using consistent
terminology.
* wtf/Bitmap.h:
(WTF::WordType>::concurrentTestAndSet): Renamed weakCompareAndSwap.
(WTF::WordType>::concurrentTestAndClear): Renamed weakCompareAndSwap.
* wtf/FastBitVector.h:
(WTF::FastBitVector::atomicSetAndCheck): Renamed weakCompareAndSwap.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
Added unparkCount(), which lets you unpark some bounded number of threads and returns the
number of threads unparked. This is just a modest extension of unparkAll(). unparkAll() now
just calls unparkCount(ptr, UINT_MAX).
Tools:
Use the right kind of typed array API.
* DumpRenderTree/TestRunner.cpp:
(setAudioResultCallback):
LayoutTests:
Adding tests. This is a work in progress.
* workers/sab: Added.
* workers/sab/simple-worker-1.js: Added.
(onmessage):
* workers/sab/simple-worker-2.js: Added.
(onmessage):
* workers/sab/simple.html: Added.
Canonical link: https://commits.webkit.org/181984@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208209 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-01 03:10:00 +00:00
if ( threadDatas . size ( ) = = count )
return DequeueResult : : RemoveAndStop ;
2015-08-11 19:51:35 +00:00
return DequeueResult : : RemoveAndContinue ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
} ,
[ ] ( bool ) { } ) ;
2015-08-11 19:51:35 +00:00
2016-09-13 15:59:25 +00:00
for ( RefPtr < ThreadData > & threadData : threadDatas ) {
2015-08-11 19:51:35 +00:00
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : unparking " , RawPointer ( threadData . get ( ) ) , " with address " , RawPointer ( threadData - > address ) , " \n " ) ) ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
ASSERT ( threadData - > address ) ;
{
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
MutexLocker locker ( threadData - > parkingLock ) ;
WTF::Lock should not suffer from the thundering herd
https://bugs.webkit.org/show_bug.cgi?id=147947
Reviewed by Geoffrey Garen.
Source/WTF:
This changes Lock::unlockSlow() to use unparkOne() instead of unparkAll(). The problem with
doing this is that it's not obvious after calling unparkOne() if there are any other threads
that are still parked on the lock's queue. If we assume that there are and leave the
hasParkedBit set, then future calls to unlock() will take the slow path. We don't want that
if there aren't actually any threads parked. On the other hand, if we assume that there
aren't any threads parked and clear the hasParkedBit, then if there actually were some
threads parked, then they may never be awoken since future calls to unlock() won't take slow
path and so won't call unparkOne(). In other words, we need a way to be very precise about
when we clear the hasParkedBit and we need to do it in a race-free way: it can't be the case
that we clear the bit just as some thread gets parked on the queue.
A similar problem arises in futexes, and one of the solutions is to have a thread that
acquires a lock after parking sets the hasParkedBit. This is what Rusty Russel's usersem
does. It's a subtle algorithm. Also, it means that if a thread barges in before the unparked
thread runs, then that barging thread will not know that there are threads parked. This
could increase the severity of barging.
Since ParkingLot is a user-level API, we don't have to worry about the kernel-user security
issues and so we can expose callbacks while ParkingLot is holding its internal locks. This
change does exactly that for unparkOne(). The new variant of unparkOne() will call a user
function while the queue from which we are unparking is locked. The callback is told basic
stats about the queue: did we unpark a thread this time, and could there be more threads to
unpark in the future. The callback runs while it's impossible for the queue state to change,
since the ParkingLot's internal locks for the queue is held. This means that
Lock::unlockSlow() can either clear, or leave, the hasParkedBit while releasing the lock
inside the callback from unparkOne(). This takes care of the thundering herd problem while
also reducing the greed that arises from barging threads.
This required some careful reworking of the ParkingLot algorithm. The first thing I noticed
was that the ThreadData::shouldPark flag was useless, since it's set exactly when
ThreadData::address is non-null. Then I had to make sure that dequeue() could lazily create
both hashtables and buckets, since the "callback is called while queue is locked" invariant
requires that we didn't exit early due to the hashtable or bucket not being present. Note
that all of this is done in such a way that the old unparkOne() and unparkAll() don't have
to create any buckets, though they now may create the hashtable. We don't care as much about
the hashtable being created by unpark since it's just such an unlikely scenario and it would
only happen once.
This change reduces the kernel CPU usage of WTF::Lock for the long critical section test by
about 8x and makes it always perform as well as WTF::WordLock and WTF::Mutex for that
benchmark.
* benchmarks/LockSpeedTest.cpp:
* wtf/Lock.cpp:
(WTF::LockBase::unlockSlow):
* wtf/Lock.h:
(WTF::LockBase::isLocked):
(WTF::LockBase::isFullyReset):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkAll):
* wtf/ParkingLot.h:
* wtf/WordLock.h:
(WTF::WordLock::isLocked):
(WTF::WordLock::isFullyReset):
Tools:
Add testing that checks that locks return to a pristine state after contention is over.
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::LockInspector::isFullyReset):
(TestWebKitAPI::runLockTest):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/166072@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@188374 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-08-13 03:51:25 +00:00
threadData - > address = nullptr ;
}
WTF::ParkingLot should stop using std::chrono because std::chrono::duration casts are prone to overflows
https://bugs.webkit.org/show_bug.cgi?id=152045
Reviewed by Andy Estes.
Source/JavaScriptCore:
Probably the nicest example of why this patch is a good idea is the change in
AtomicsObject.cpp.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
* runtime/AtomicsObject.cpp:
(JSC::atomicsFuncWait):
Source/WebCore:
No new layout tests because no new behavior. The new WTF time classes have some unit tests
in TestWebKitAPI.
* fileapi/ThreadableBlobRegistry.cpp:
(WebCore::ThreadableBlobRegistry::blobSize):
* platform/MainThreadSharedTimer.h:
* platform/SharedTimer.h:
* platform/ThreadTimers.cpp:
(WebCore::ThreadTimers::updateSharedTimer):
* platform/cf/MainThreadSharedTimerCF.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/efl/MainThreadSharedTimerEfl.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/glib/MainThreadSharedTimerGLib.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* platform/win/MainThreadSharedTimerWin.cpp:
(WebCore::MainThreadSharedTimer::setFireInterval):
* workers/WorkerRunLoop.cpp:
(WebCore::WorkerRunLoop::runInMode):
Source/WebKit2:
* Platform/IPC/Connection.cpp:
(IPC::Connection::SyncMessageState::wait):
(IPC::Connection::sendMessage):
(IPC::Connection::timeoutRespectingIgnoreTimeoutsForTesting):
(IPC::Connection::waitForMessage):
(IPC::Connection::sendSyncMessage):
(IPC::Connection::waitForSyncReply):
* Platform/IPC/Connection.h:
(IPC::Connection::sendSync):
(IPC::Connection::waitForAndDispatchImmediately):
* Platform/IPC/MessageSender.h:
(IPC::MessageSender::sendSync):
* UIProcess/ChildProcessProxy.h:
(WebKit::ChildProcessProxy::sendSync):
* UIProcess/Network/NetworkProcessProxy.cpp:
(WebKit::NetworkProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/Storage/StorageManager.cpp:
(WebKit::StorageManager::applicationWillTerminate):
* UIProcess/WebProcessProxy.cpp:
(WebKit::WebProcessProxy::sendProcessWillSuspendImminently):
* UIProcess/WebResourceLoadStatisticsStore.cpp:
(WebKit::WebResourceLoadStatisticsStore::applicationWillTerminate):
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.h:
* UIProcess/mac/RemoteLayerTreeDrawingAreaProxy.mm:
(-[WKOneShotDisplayLinkHandler displayLinkFired:]):
(WebKit::RemoteLayerTreeDrawingAreaProxy::commitLayerTree):
(WebKit::RemoteLayerTreeDrawingAreaProxy::didRefreshDisplay):
(WebKit::RemoteLayerTreeDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/TiledCoreAnimationDrawingAreaProxy.mm:
(WebKit::TiledCoreAnimationDrawingAreaProxy::waitForDidUpdateActivityState):
* UIProcess/mac/WKImmediateActionController.mm:
(-[WKImmediateActionController immediateActionRecognizerWillBeginAnimation:]):
* UIProcess/mac/WebPageProxyMac.mm:
(WebKit::WebPageProxy::stringSelectionForPasteboard):
(WebKit::WebPageProxy::dataSelectionForPasteboard):
(WebKit::WebPageProxy::readSelectionFromPasteboard):
(WebKit::WebPageProxy::shouldDelayWindowOrderingForEvent):
(WebKit::WebPageProxy::acceptsFirstMouse):
* WebProcess/WebCoreSupport/WebChromeClient.cpp:
(WebKit::WebChromeClient::runBeforeUnloadConfirmPanel):
(WebKit::WebChromeClient::runJavaScriptAlert):
(WebKit::WebChromeClient::runJavaScriptConfirm):
(WebKit::WebChromeClient::runJavaScriptPrompt):
(WebKit::WebChromeClient::print):
(WebKit::WebChromeClient::exceededDatabaseQuota):
(WebKit::WebChromeClient::reachedApplicationCacheOriginQuota):
* WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:
(WebKit::WebFrameLoaderClient::dispatchDecidePolicyForResponse):
* WebProcess/WebPage/WebPage.cpp:
(WebKit::WebPage::postSynchronousMessageForTesting):
Source/WTF:
We used to use 'double' for all time measurements. Sometimes it was milliseconds,
sometimes it was seconds. Sometimes we measured a span of time, sometimes we spoke of time
since some epoch. When we spoke of time since epoch, we either used a monotonic clock or
a wall clock. The type - always 'double' - never told us what kind of time we had, even
though there were roughly six of them (sec interval, ms interval, sec since epoch on wall,
ms since epoch on wall, sec since epoch monotonic, ms since epoch monotonic).
At some point, we thought that it would be a good idea to replace these doubles with
std::chrono. But since replacing some things with std::chrono, we found it to be terribly
inconvenient:
- Outrageous API. I never want to say std::chrono::milliseconds(blah). I never want to say
std::chrono::steady_clock::timepoint. The syntax for duration_cast is ugly, and ideally
duration_cast would not even be a thing.
- No overflow protection. std::chrono uses integers by default and using anything else is
clumsy. But the integer math is done without regard for the rough edges of integer math,
so any cast between std::chrono types risks overflow. Any comparison risks overflow
because it may do conversions silently. We have even found bugs where some C++
implementations had more overflows than others, which ends up being a special kind of
hell. In many cases, the overflow also has nasal demons.
It's an error to represent time using integers. It would have been excusable back when
floating point math was not guaranteed to be supported on all platforms, but that would
have been a long time ago. Time is a continuous, infinite concept and it's a perfect fit
for floating point:
- Floating point preserves precision under multiplication in all but extreme cases, so
using floating point for time means that unit conversions are almost completely
lossless. This means that we don't have to think very hard about what units to use. In
this patch, we use seconds almost everywhere. We only convert at boundaries, like an API
boundary that wants something other than seconds.
- Floating point makes it easy to reason about infinity, which is something that time code
wants to do a lot. Example: when would you like to timeout? Infinity please! This is the
most elegant way of having an API support both a timeout variant and a no-timeout
variant.
- Floating point does well-understood things when math goes wrong, and these things are
pretty well optimized to match what a mathematician would do when computing with real
numbers represented using scientific notation with a finite number of significant
digits. This means that time math under floating point looks like normal math. On the
other hand, std::chrono time math looks like garbage because you have to always check
for multiple possible UB corners whenever you touch large integers. Integers that
represent time are very likely to be large and you don't have to do much to overflow
them. At this time, based on the number of bugs we have already seen due to chrono
overflows, I am not certain that we even understand what are all of the corner cases
that we should even check for.
This patch introduces a new set of timekeeping classes that are all based on double, and
all internally use seconds. These classes support algebraic typing. The classes are:
- Seconds: this is for measuring a duration.
- WallTime: time since epoch according to a wall clock (aka real time clock).
- MonotonicTime: time since epoch according to a monotonic clock.
- ClockType: enum that says either Wall or Monotonic.
- TimeWithDynamicClockType: a tuple of double and ClockType, which represents either a
wall time or a monotonic time.
All of these classes behave like C++ values and are cheap to copy around since they are
very nearly POD. This supports comprehensive conversions between the various time types.
Most of this is by way of algebra. Here are just some of the rules we recognize:
WallTime = WallTime + Seconds
Seconds = WallTime - WallTime
MonotonicTime = MonotonicTime + Seconds
etc...
We support negative, infinite, and NaN times because math.
We support conversions between MonotonicTime and WallTime, like:
WallTime wt = mt.approximateWallTime()
This is called this "approximate" because the only way to do it is to get the current time
on both clocks and convert relative to that.
Many of our APIs would be happy using whatever notion of time the user wanted to use. For
those APIs, which includes Condition and ParkingLot, we have TimeWithDynamicClockType. You
can automatically convert WallTime or MonotonicTime to TimeWithDynamicClockType. This
means that if you use a WallTime with Condition::waitUntil, then Condition's internal
logic for when it should wake up makes its decision based on the current WallTime - but if
you use MonotonicTime then waitUntil will make its decision based on current
MonotonicTime. This is a greater level of flexibility than chrono allowed, since chrono
did not have the concept of a dynamic clock type.
This patch does not include conversions between std::chrono and these new time classes,
because past experience shows that we're quite bad at getting conversions between
std::chrono and anything else right. Also, I didn't need such conversion code because this
patch only converts code that transitively touches ParkingLot and Condition. It was easy
to get all of that code onto the new time classes.
* WTF.xcodeproj/project.pbxproj:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/ClockType.cpp: Added.
(WTF::printInternal):
* wtf/ClockType.h: Added.
* wtf/Condition.h:
(WTF::ConditionBase::waitUntil):
(WTF::ConditionBase::waitFor):
(WTF::ConditionBase::wait):
(WTF::ConditionBase::waitUntilWallClockSeconds): Deleted.
(WTF::ConditionBase::waitUntilMonotonicClockSeconds): Deleted.
(WTF::ConditionBase::waitForSeconds): Deleted.
(WTF::ConditionBase::waitForSecondsImpl): Deleted.
(WTF::ConditionBase::waitForImpl): Deleted.
(WTF::ConditionBase::absoluteFromRelative): Deleted.
* wtf/CrossThreadQueue.h:
(WTF::CrossThreadQueue<DataType>::waitForMessage):
* wtf/CurrentTime.cpp:
(WTF::sleep):
* wtf/MessageQueue.h:
(WTF::MessageQueue::infiniteTime): Deleted.
* wtf/MonotonicTime.cpp: Added.
(WTF::MonotonicTime::now):
(WTF::MonotonicTime::approximateWallTime):
(WTF::MonotonicTime::dump):
(WTF::MonotonicTime::sleep):
* wtf/MonotonicTime.h: Added.
(WTF::MonotonicTime::MonotonicTime):
(WTF::MonotonicTime::fromRawDouble):
(WTF::MonotonicTime::infinity):
(WTF::MonotonicTime::secondsSinceEpoch):
(WTF::MonotonicTime::approximateMonotonicTime):
(WTF::MonotonicTime::operator bool):
(WTF::MonotonicTime::operator+):
(WTF::MonotonicTime::operator-):
(WTF::MonotonicTime::operator+=):
(WTF::MonotonicTime::operator-=):
(WTF::MonotonicTime::operator==):
(WTF::MonotonicTime::operator!=):
(WTF::MonotonicTime::operator<):
(WTF::MonotonicTime::operator>):
(WTF::MonotonicTime::operator<=):
(WTF::MonotonicTime::operator>=):
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::parkConditionallyImpl):
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkOneImpl):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
(WTF::ParkingLot::parkConditionally):
(WTF::ParkingLot::compareAndPark):
* wtf/Seconds.cpp: Added.
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::dump):
(WTF::Seconds::sleep):
* wtf/Seconds.h: Added.
(WTF::Seconds::Seconds):
(WTF::Seconds::value):
(WTF::Seconds::seconds):
(WTF::Seconds::milliseconds):
(WTF::Seconds::microseconds):
(WTF::Seconds::nanoseconds):
(WTF::Seconds::fromMilliseconds):
(WTF::Seconds::fromMicroseconds):
(WTF::Seconds::fromNanoseconds):
(WTF::Seconds::infinity):
(WTF::Seconds::operator bool):
(WTF::Seconds::operator+):
(WTF::Seconds::operator-):
(WTF::Seconds::operator*):
(WTF::Seconds::operator/):
(WTF::Seconds::operator+=):
(WTF::Seconds::operator-=):
(WTF::Seconds::operator*=):
(WTF::Seconds::operator/=):
(WTF::Seconds::operator==):
(WTF::Seconds::operator!=):
(WTF::Seconds::operator<):
(WTF::Seconds::operator>):
(WTF::Seconds::operator<=):
(WTF::Seconds::operator>=):
* wtf/TimeWithDynamicClockType.cpp: Added.
(WTF::TimeWithDynamicClockType::now):
(WTF::TimeWithDynamicClockType::nowWithSameClock):
(WTF::TimeWithDynamicClockType::wallTime):
(WTF::TimeWithDynamicClockType::monotonicTime):
(WTF::TimeWithDynamicClockType::approximateWallTime):
(WTF::TimeWithDynamicClockType::approximateMonotonicTime):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator<):
(WTF::TimeWithDynamicClockType::operator>):
(WTF::TimeWithDynamicClockType::operator<=):
(WTF::TimeWithDynamicClockType::operator>=):
(WTF::TimeWithDynamicClockType::dump):
(WTF::TimeWithDynamicClockType::sleep):
* wtf/TimeWithDynamicClockType.h: Added.
(WTF::TimeWithDynamicClockType::TimeWithDynamicClockType):
(WTF::TimeWithDynamicClockType::fromRawDouble):
(WTF::TimeWithDynamicClockType::secondsSinceEpoch):
(WTF::TimeWithDynamicClockType::clockType):
(WTF::TimeWithDynamicClockType::withSameClockAndRawDouble):
(WTF::TimeWithDynamicClockType::operator bool):
(WTF::TimeWithDynamicClockType::operator+):
(WTF::TimeWithDynamicClockType::operator-):
(WTF::TimeWithDynamicClockType::operator+=):
(WTF::TimeWithDynamicClockType::operator-=):
(WTF::TimeWithDynamicClockType::operator==):
(WTF::TimeWithDynamicClockType::operator!=):
* wtf/WallTime.cpp: Added.
(WTF::WallTime::now):
(WTF::WallTime::approximateMonotonicTime):
(WTF::WallTime::dump):
(WTF::WallTime::sleep):
* wtf/WallTime.h: Added.
(WTF::WallTime::WallTime):
(WTF::WallTime::fromRawDouble):
(WTF::WallTime::infinity):
(WTF::WallTime::secondsSinceEpoch):
(WTF::WallTime::approximateWallTime):
(WTF::WallTime::operator bool):
(WTF::WallTime::operator+):
(WTF::WallTime::operator-):
(WTF::WallTime::operator+=):
(WTF::WallTime::operator-=):
(WTF::WallTime::operator==):
(WTF::WallTime::operator!=):
(WTF::WallTime::operator<):
(WTF::WallTime::operator>):
(WTF::WallTime::operator<=):
(WTF::WallTime::operator>=):
* wtf/threads/BinarySemaphore.cpp:
(WTF::BinarySemaphore::wait):
* wtf/threads/BinarySemaphore.h:
Tools:
* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Condition.cpp:
(TestWebKitAPI::TEST):
* TestWebKitAPI/Tests/WTF/SynchronizedFixedQueue.cpp:
(TestWebKitAPI::ToUpperConverter::stopProducing):
(TestWebKitAPI::ToUpperConverter::stopConsuming):
* TestWebKitAPI/Tests/WTF/Time.cpp: Added.
(WTF::operator<<):
(TestWebKitAPI::TEST):
Canonical link: https://commits.webkit.org/182152@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208415 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-05 03:02:39 +00:00
threadData - > parkingCondition . signal ( ) ;
2015-08-11 19:51:35 +00:00
}
if ( verbose )
2017-12-04 06:13:05 +00:00
dataLog ( toString ( Thread : : current ( ) , " : done unparking. \n " ) ) ;
JSC should support SharedArrayBuffer
https://bugs.webkit.org/show_bug.cgi?id=163986
Reviewed by Keith Miller.
JSTests:
This adds our own test for the various corner cases of SharedArrayBuffer. This test is meant to
check all of the things that don't require concurrency.
* stress/SharedArrayBuffer.js: Added.
(checkAtomics):
(shouldFail):
(Symbol):
(runAtomic):
Source/JavaScriptCore:
This implements https://tc39.github.io/ecmascript_sharedmem/shmem.html.
There is now a new SharedArrayBuffer type. In the JS runtime, which includes typed array
types, the SharedArrayBuffer is a drop-in replacement for ArrayBuffer, even though they are
distinct types (new SharedArrayBuffer() instanceof ArrayBuffer == false and vice versa). The
DOM will not recognize SharedArrayBuffer, or any typed array that wraps it, to ensure safety.
This matches what other browsers intend to do, see
https://github.com/tc39/ecmascript_sharedmem/issues/38. API is provided for the DOM to opt
into SharedArrayBuffer. One notable place is postMessage, which will share the
SharedArrayBuffer's underlying data storage with other workers. This creates a pool of shared
memory that the workers can use to talk to each other.
There is also an Atomics object in global scope, which exposes sequentially consistent atomic
operations: add, and, compareExchange, exchange, load, or, store, sub, and xor. Additionally
it exposes a Atomics.isLockFree utility, which takes a byte amount and returns true or false.
Also there is Atomics.wake/wait, which neatly map to ParkingLot.
Accesses to typed arrays that wrap SharedArrayBuffer are optimized by JSC the same way as
always. I believe that DFG and B3 already obey the following memory model, which I believe is
a bit weaker than Cambridge and a bit stronger than what is being proposed for
SharedArrayBuffer. To predict a program's behavior under the B3 memory model, imagine the
space of all possible programs that would result from running an optimizer that adversarially
follows B3's transformation rules. B3 transformations are correct if the newly created
program is equivalent to the old one, assuming that any opaque effect in IR (like the reads
and writes of a patchpoint/call/fence) could perform any load/store that satisfies the
B3::Effects summary. Opaque effects are a way of describing an infinite set of programs: any
program that only does the effects summarized in B3::Effects belongs to the set. For example,
this prevents motion of operations across fences since fences are summarized as opaque
effects that could read or write memory. This rule alone is not enough, because it leaves the
door open for turning an atomic operation (like a load) into a non-atomic one (like a load
followed by a store of the same value back to the same location or multiple loads). This is
not an optimization that either our compiler or the CPU would want to do. One way to think of
what exactly is forbidden is that B3 transformations that mess with memory accesses can only
reorder them or remove them. This means that for any execution of the untransformed program,
the corresponding execution of the transformed program (i.e. with the same input arguments
and the same programs filled in for the opaque effects) must have the same loads and stores,
with some removed and some reordered. This is a fairly simple mental model that B3 and DFG
already follow and it's based on existing abstractions for the infinite set of programs
inside an opaque effect (DFG's AbstractHeaps and B3's Effects).
This patch makes all atomics operations intrinsic, but the DFG doesn't know about any of them
yet. That's covered by bug 164108.
This ought to be perf-neutral, but I am still running tests to confirm this. I'm also still
writing new tests to cover all of the Atomics functionality and the behavior of SAB objects.
* API/JSTypedArray.cpp:
(JSObjectGetTypedArrayBytesPtr):
(JSObjectGetTypedArrayBuffer):
(JSObjectMakeArrayBufferWithBytesNoCopy):
* API/tests/CompareAndSwapTest.cpp:
(Bitmap::concurrentTestAndSet):
* CMakeLists.txt:
* JavaScriptCore.xcodeproj/project.pbxproj:
* dfg/DFGDesiredWatchpoints.cpp:
(JSC::DFG::ArrayBufferViewWatchpointAdaptor::add):
* heap/Heap.cpp:
(JSC::Heap::reportExtraMemoryVisited):
(JSC::Heap::reportExternalMemoryVisited):
* jsc.cpp:
(functionTransferArrayBuffer):
* runtime/ArrayBuffer.cpp:
(JSC::SharedArrayBufferContents::SharedArrayBufferContents):
(JSC::SharedArrayBufferContents::~SharedArrayBufferContents):
(JSC::ArrayBufferContents::ArrayBufferContents):
(JSC::ArrayBufferContents::operator=):
(JSC::ArrayBufferContents::~ArrayBufferContents):
(JSC::ArrayBufferContents::clear):
(JSC::ArrayBufferContents::destroy):
(JSC::ArrayBufferContents::reset):
(JSC::ArrayBufferContents::tryAllocate):
(JSC::ArrayBufferContents::makeShared):
(JSC::ArrayBufferContents::transferTo):
(JSC::ArrayBufferContents::copyTo):
(JSC::ArrayBufferContents::shareWith):
(JSC::ArrayBuffer::create):
(JSC::ArrayBuffer::createAdopted):
(JSC::ArrayBuffer::createFromBytes):
(JSC::ArrayBuffer::tryCreate):
(JSC::ArrayBuffer::createUninitialized):
(JSC::ArrayBuffer::tryCreateUninitialized):
(JSC::ArrayBuffer::createInternal):
(JSC::ArrayBuffer::ArrayBuffer):
(JSC::ArrayBuffer::slice):
(JSC::ArrayBuffer::sliceImpl):
(JSC::ArrayBuffer::makeShared):
(JSC::ArrayBuffer::setSharingMode):
(JSC::ArrayBuffer::transferTo):
(JSC::ArrayBuffer::transfer): Deleted.
* runtime/ArrayBuffer.h:
(JSC::arrayBufferSharingModeName):
(JSC::SharedArrayBufferContents::data):
(JSC::ArrayBufferContents::data):
(JSC::ArrayBufferContents::sizeInBytes):
(JSC::ArrayBufferContents::isShared):
(JSC::ArrayBuffer::sharingMode):
(JSC::ArrayBuffer::isShared):
(JSC::ArrayBuffer::gcSizeEstimateInBytes):
(JSC::arrayBufferDestructorNull): Deleted.
(JSC::arrayBufferDestructorDefault): Deleted.
(JSC::ArrayBufferContents::ArrayBufferContents): Deleted.
(JSC::ArrayBufferContents::transfer): Deleted.
(JSC::ArrayBufferContents::copyTo): Deleted.
(JSC::ArrayBuffer::create): Deleted.
(JSC::ArrayBuffer::createAdopted): Deleted.
(JSC::ArrayBuffer::createFromBytes): Deleted.
(JSC::ArrayBuffer::tryCreate): Deleted.
(JSC::ArrayBuffer::createUninitialized): Deleted.
(JSC::ArrayBuffer::tryCreateUninitialized): Deleted.
(JSC::ArrayBuffer::createInternal): Deleted.
(JSC::ArrayBuffer::ArrayBuffer): Deleted.
(JSC::ArrayBuffer::slice): Deleted.
(JSC::ArrayBuffer::sliceImpl): Deleted.
(JSC::ArrayBufferContents::tryAllocate): Deleted.
(JSC::ArrayBufferContents::~ArrayBufferContents): Deleted.
* runtime/ArrayBufferSharingMode.h: Added.
* runtime/ArrayBufferView.h:
(JSC::ArrayBufferView::possiblySharedBuffer):
(JSC::ArrayBufferView::unsharedBuffer):
(JSC::ArrayBufferView::isShared):
(JSC::ArrayBufferView::buffer): Deleted.
* runtime/AtomicsObject.cpp: Added.
(JSC::AtomicsObject::AtomicsObject):
(JSC::AtomicsObject::create):
(JSC::AtomicsObject::createStructure):
(JSC::AtomicsObject::finishCreation):
(JSC::atomicsFuncAdd):
(JSC::atomicsFuncAnd):
(JSC::atomicsFuncCompareExchange):
(JSC::atomicsFuncExchange):
(JSC::atomicsFuncIsLockFree):
(JSC::atomicsFuncLoad):
(JSC::atomicsFuncOr):
(JSC::atomicsFuncStore):
(JSC::atomicsFuncSub):
(JSC::atomicsFuncWait):
(JSC::atomicsFuncWake):
(JSC::atomicsFuncXor):
* runtime/AtomicsObject.h: Added.
* runtime/CommonIdentifiers.h:
* runtime/DataView.cpp:
(JSC::DataView::wrap):
* runtime/GenericTypedArrayViewInlines.h:
(JSC::GenericTypedArrayView<Adaptor>::subarray):
* runtime/Intrinsic.h:
* runtime/JSArrayBuffer.cpp:
(JSC::JSArrayBuffer::finishCreation):
(JSC::JSArrayBuffer::isShared):
(JSC::JSArrayBuffer::sharingMode):
* runtime/JSArrayBuffer.h:
(JSC::toPossiblySharedArrayBuffer):
(JSC::toUnsharedArrayBuffer):
(JSC::JSArrayBuffer::toWrapped):
(JSC::toArrayBuffer): Deleted.
* runtime/JSArrayBufferConstructor.cpp:
(JSC::JSArrayBufferConstructor::JSArrayBufferConstructor):
(JSC::JSArrayBufferConstructor::finishCreation):
(JSC::JSArrayBufferConstructor::create):
(JSC::constructArrayBuffer):
* runtime/JSArrayBufferConstructor.h:
(JSC::JSArrayBufferConstructor::sharingMode):
* runtime/JSArrayBufferPrototype.cpp:
(JSC::arrayBufferProtoFuncSlice):
(JSC::JSArrayBufferPrototype::JSArrayBufferPrototype):
(JSC::JSArrayBufferPrototype::finishCreation):
(JSC::JSArrayBufferPrototype::create):
* runtime/JSArrayBufferPrototype.h:
* runtime/JSArrayBufferView.cpp:
(JSC::JSArrayBufferView::finishCreation):
(JSC::JSArrayBufferView::visitChildren):
(JSC::JSArrayBufferView::unsharedBuffer):
(JSC::JSArrayBufferView::unsharedJSBuffer):
(JSC::JSArrayBufferView::possiblySharedJSBuffer):
(JSC::JSArrayBufferView::neuter):
(JSC::JSArrayBufferView::toWrapped): Deleted.
* runtime/JSArrayBufferView.h:
(JSC::JSArrayBufferView::jsBuffer): Deleted.
* runtime/JSArrayBufferViewInlines.h:
(JSC::JSArrayBufferView::isShared):
(JSC::JSArrayBufferView::possiblySharedBuffer):
(JSC::JSArrayBufferView::possiblySharedImpl):
(JSC::JSArrayBufferView::unsharedImpl):
(JSC::JSArrayBufferView::byteOffset):
(JSC::JSArrayBufferView::toWrapped):
(JSC::JSArrayBufferView::buffer): Deleted.
(JSC::JSArrayBufferView::impl): Deleted.
(JSC::JSArrayBufferView::neuter): Deleted.
* runtime/JSDataView.cpp:
(JSC::JSDataView::possiblySharedTypedImpl):
(JSC::JSDataView::unsharedTypedImpl):
(JSC::JSDataView::getTypedArrayImpl):
(JSC::JSDataView::typedImpl): Deleted.
* runtime/JSDataView.h:
(JSC::JSDataView::possiblySharedBuffer):
(JSC::JSDataView::unsharedBuffer):
(JSC::JSDataView::buffer): Deleted.
* runtime/JSDataViewPrototype.cpp:
(JSC::dataViewProtoGetterBuffer):
* runtime/JSGenericTypedArrayView.h:
(JSC::toPossiblySharedNativeTypedView):
(JSC::toUnsharedNativeTypedView):
(JSC::JSGenericTypedArrayView<Adaptor>::toWrapped):
(JSC::JSGenericTypedArrayView::typedImpl): Deleted.
(JSC::toNativeTypedView): Deleted.
* runtime/JSGenericTypedArrayViewInlines.h:
(JSC::JSGenericTypedArrayView<Adaptor>::create):
(JSC::JSGenericTypedArrayView<Adaptor>::possiblySharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::unsharedTypedImpl):
(JSC::JSGenericTypedArrayView<Adaptor>::getTypedArrayImpl):
* runtime/JSGenericTypedArrayViewPrototypeFunctions.h:
(JSC::genericTypedArrayViewProtoGetterFuncBuffer):
(JSC::genericTypedArrayViewPrivateFuncSubarrayCreate):
* runtime/JSGlobalObject.cpp:
(JSC::createAtomicsProperty):
(JSC::JSGlobalObject::init):
(JSC::JSGlobalObject::visitChildren):
* runtime/JSGlobalObject.h:
(JSC::JSGlobalObject::arrayBufferPrototype):
(JSC::JSGlobalObject::arrayBufferStructure):
* runtime/MathObject.cpp:
* runtime/RuntimeFlags.h:
* runtime/SimpleTypedArrayController.cpp:
(JSC::SimpleTypedArrayController::toJS):
* runtime/TypedArrayType.h:
(JSC::typedArrayTypeForType):
Source/WebCore:
New tests added in the LayoutTests/workers/sab directory.
This teaches WebCore that a typed array could be shared or not. By default, WebCore will
reject shared typed arrays as if they were not typed arrays. This ensures that we don't get
race conditions in code that can't handle it.
If you postMessage a SharedArrayBuffer or something that wraps it, you will send the shared
memory to the other worker.
* Modules/encryptedmedia/CDMSessionClearKey.cpp:
(WebCore::CDMSessionClearKey::cachedKeyForKeyID):
* Modules/fetch/FetchBody.cpp:
(WebCore::FetchBody::extract):
* Modules/mediastream/RTCDataChannel.cpp:
(WebCore::RTCDataChannel::send):
* Modules/webaudio/AudioBuffer.cpp:
(WebCore::AudioBuffer::getChannelData):
* Modules/websockets/WebSocket.cpp:
(WebCore::WebSocket::send):
* bindings/js/JSBlobCustom.cpp:
(WebCore::constructJSBlob):
* bindings/js/JSCryptoAlgorithmDictionary.cpp:
(WebCore::createRsaKeyGenParams):
* bindings/js/JSCryptoCustom.cpp:
(WebCore::JSCrypto::getRandomValues):
* bindings/js/JSCryptoOperationData.cpp:
(WebCore::cryptoOperationDataFromJSValue):
* bindings/js/JSDOMBinding.h:
(WebCore::toJS):
(WebCore::toPossiblySharedArrayBufferView):
(WebCore::toUnsharedArrayBufferView):
(WebCore::toPossiblySharedInt8Array):
(WebCore::toPossiblySharedInt16Array):
(WebCore::toPossiblySharedInt32Array):
(WebCore::toPossiblySharedUint8Array):
(WebCore::toPossiblySharedUint8ClampedArray):
(WebCore::toPossiblySharedUint16Array):
(WebCore::toPossiblySharedUint32Array):
(WebCore::toPossiblySharedFloat32Array):
(WebCore::toPossiblySharedFloat64Array):
(WebCore::toUnsharedInt8Array):
(WebCore::toUnsharedInt16Array):
(WebCore::toUnsharedInt32Array):
(WebCore::toUnsharedUint8Array):
(WebCore::toUnsharedUint8ClampedArray):
(WebCore::toUnsharedUint16Array):
(WebCore::toUnsharedUint32Array):
(WebCore::toUnsharedFloat32Array):
(WebCore::toUnsharedFloat64Array):
(WebCore::toArrayBufferView): Deleted.
(WebCore::toInt8Array): Deleted.
(WebCore::toInt16Array): Deleted.
(WebCore::toInt32Array): Deleted.
(WebCore::toUint8Array): Deleted.
(WebCore::toUint8ClampedArray): Deleted.
(WebCore::toUint16Array): Deleted.
(WebCore::toUint32Array): Deleted.
(WebCore::toFloat32Array): Deleted.
(WebCore::toFloat64Array): Deleted.
* bindings/js/JSDataCueCustom.cpp:
(WebCore::constructJSDataCue):
* bindings/js/JSDictionary.cpp:
(WebCore::JSDictionary::convertValue):
* bindings/js/JSFileCustom.cpp:
(WebCore::constructJSFile):
* bindings/js/JSMessagePortCustom.cpp:
(WebCore::extractTransferables):
* bindings/js/JSWebGLRenderingContextBaseCustom.cpp:
(WebCore::dataFunctionf):
(WebCore::dataFunctioni):
(WebCore::dataFunctionMatrix):
* bindings/js/JSXMLHttpRequestCustom.cpp:
(WebCore::JSXMLHttpRequest::send):
* bindings/js/SerializedScriptValue.cpp:
(WebCore::CloneSerializer::dumpArrayBufferView):
(WebCore::CloneSerializer::dumpIfTerminal):
(WebCore::CloneDeserializer::readArrayBufferView):
(WebCore::CloneDeserializer::readTerminal):
(WebCore::SerializedScriptValue::transferArrayBuffers):
* bindings/js/StructuredClone.cpp:
(WebCore::structuredCloneArrayBuffer):
(WebCore::structuredCloneArrayBufferView):
* bindings/scripts/CodeGeneratorJS.pm:
(JSValueToNative):
* css/FontFace.cpp:
(WebCore::FontFace::create):
* html/canvas/WebGL2RenderingContext.cpp:
(WebCore::WebGL2RenderingContext::bufferData):
(WebCore::WebGL2RenderingContext::bufferSubData):
* platform/graphics/avfoundation/MediaPlayerPrivateAVFoundation.cpp:
(WebCore::MediaPlayerPrivateAVFoundation::extractKeyURIKeyIDAndCertificateFromInitData):
Source/WebKit/mac:
Support the RuntimeFlag.
* WebView/WebPreferencesPrivate.h:
Source/WebKit/win:
Support the RuntimeFlag.
* Interfaces/IWebPreferencesPrivate.idl:
Source/WebKit2:
Adds some small things we need for SharedArrayBuffer.
* UIProcess/API/C/WKPreferencesRefPrivate.h:
* UIProcess/API/Cocoa/WKPreferencesPrivate.h:
* WebProcess/InjectedBundle/InjectedBundle.cpp:
(WebKit::InjectedBundle::createWebDataFromUint8Array):
Source/WTF:
Adds some small things we need for SharedArrayBuffer.
* wtf/Atomics.h:
(WTF::Atomic::compareExchangeWeakRelaxed):
(WTF::Atomic::exchangeAdd):
(WTF::Atomic::exchangeAnd):
(WTF::Atomic::exchangeOr):
(WTF::Atomic::exchangeSub):
(WTF::Atomic::exchangeXor):
(WTF::atomicLoad):
(WTF::atomicStore):
(WTF::atomicCompareExchangeWeak):
(WTF::atomicCompareExchangeWeakRelaxed):
(WTF::atomicCompareExchangeStrong):
(WTF::atomicExchangeAdd):
(WTF::atomicExchangeAnd):
(WTF::atomicExchangeOr):
(WTF::atomicExchangeSub):
(WTF::atomicExchangeXor):
(WTF::atomicExchange):
(WTF::Atomic::exchangeAndAdd): Deleted.
(WTF::weakCompareAndSwap): Deleted.
We need to be able to do atomics operations on naked pointers. We also need to be able to do
all of the things that std::atomic does. This adds those things and renames
weakCompareAndSwap to atomicCompareExchangeWeakRelaxed so that we're using consistent
terminology.
* wtf/Bitmap.h:
(WTF::WordType>::concurrentTestAndSet): Renamed weakCompareAndSwap.
(WTF::WordType>::concurrentTestAndClear): Renamed weakCompareAndSwap.
* wtf/FastBitVector.h:
(WTF::FastBitVector::atomicSetAndCheck): Renamed weakCompareAndSwap.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::unparkOne):
(WTF::ParkingLot::unparkCount):
* wtf/ParkingLot.h:
Added unparkCount(), which lets you unpark some bounded number of threads and returns the
number of threads unparked. This is just a modest extension of unparkAll(). unparkAll() now
just calls unparkCount(ptr, UINT_MAX).
Tools:
Use the right kind of typed array API.
* DumpRenderTree/TestRunner.cpp:
(setAudioResultCallback):
LayoutTests:
Adding tests. This is a work in progress.
* workers/sab: Added.
* workers/sab/simple-worker-1.js: Added.
(onmessage):
* workers/sab/simple-worker-2.js: Added.
(onmessage):
* workers/sab/simple.html: Added.
Canonical link: https://commits.webkit.org/181984@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@208209 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-11-01 03:10:00 +00:00
return threadDatas . size ( ) ;
}
NEVER_INLINE void ParkingLot : : unparkAll ( const void * address )
{
unparkCount ( address , UINT_MAX ) ;
2015-08-11 19:51:35 +00:00
}
[WTF] Introduce Thread class and use RefPtr<Thread> and align Windows Threading implementation semantics to Pthread one
https://bugs.webkit.org/show_bug.cgi?id=170502
Reviewed by Mark Lam.
Source/JavaScriptCore:
* API/tests/CompareAndSwapTest.cpp:
(testCompareAndSwap):
* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/air/testair.cpp:
* b3/testb3.cpp:
(JSC::B3::run):
* bytecode/SuperSampler.cpp:
(JSC::initializeSuperSampler):
* dfg/DFGWorklist.cpp:
* disassembler/Disassembler.cpp:
* heap/Heap.cpp:
(JSC::Heap::lastChanceToFinalize):
(JSC::Heap::notifyIsSafeToCollect):
* heap/Heap.h:
* heap/MachineStackMarker.cpp:
(JSC::MachineThreads::~MachineThreads):
(JSC::MachineThreads::addCurrentThread):
(JSC::MachineThreads::removeThread):
(JSC::MachineThreads::removeThreadIfFound):
(JSC::MachineThreads::MachineThread::MachineThread):
(JSC::MachineThreads::MachineThread::getRegisters):
(JSC::MachineThreads::MachineThread::Registers::stackPointer):
(JSC::MachineThreads::MachineThread::Registers::framePointer):
(JSC::MachineThreads::MachineThread::Registers::instructionPointer):
(JSC::MachineThreads::MachineThread::Registers::llintPC):
(JSC::MachineThreads::MachineThread::captureStack):
(JSC::MachineThreads::tryCopyOtherThreadStack):
(JSC::MachineThreads::tryCopyOtherThreadStacks):
(pthreadSignalHandlerSuspendResume): Deleted.
(JSC::threadData): Deleted.
(JSC::MachineThreads::Thread::Thread): Deleted.
(JSC::MachineThreads::Thread::createForCurrentThread): Deleted.
(JSC::MachineThreads::Thread::operator==): Deleted.
(JSC::MachineThreads::machineThreadForCurrentThread): Deleted.
(JSC::MachineThreads::ThreadData::ThreadData): Deleted.
(JSC::MachineThreads::ThreadData::~ThreadData): Deleted.
(JSC::MachineThreads::ThreadData::suspend): Deleted.
(JSC::MachineThreads::ThreadData::resume): Deleted.
(JSC::MachineThreads::ThreadData::getRegisters): Deleted.
(JSC::MachineThreads::ThreadData::Registers::stackPointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::framePointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::instructionPointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::llintPC): Deleted.
(JSC::MachineThreads::ThreadData::freeRegisters): Deleted.
(JSC::MachineThreads::ThreadData::captureStack): Deleted.
* heap/MachineStackMarker.h:
(JSC::MachineThreads::MachineThread::suspend):
(JSC::MachineThreads::MachineThread::resume):
(JSC::MachineThreads::MachineThread::threadID):
(JSC::MachineThreads::MachineThread::stackBase):
(JSC::MachineThreads::MachineThread::stackEnd):
(JSC::MachineThreads::threadsListHead):
(JSC::MachineThreads::Thread::operator!=): Deleted.
(JSC::MachineThreads::Thread::suspend): Deleted.
(JSC::MachineThreads::Thread::resume): Deleted.
(JSC::MachineThreads::Thread::getRegisters): Deleted.
(JSC::MachineThreads::Thread::freeRegisters): Deleted.
(JSC::MachineThreads::Thread::captureStack): Deleted.
(JSC::MachineThreads::Thread::platformThread): Deleted.
(JSC::MachineThreads::Thread::stackBase): Deleted.
(JSC::MachineThreads::Thread::stackEnd): Deleted.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
(JSC::ICStats::~ICStats):
* jit/ICStats.h:
* jsc.cpp:
(functionDollarAgentStart):
(startTimeoutThreadIfNeeded):
* runtime/JSLock.cpp:
(JSC::JSLock::lock):
* runtime/JSLock.h:
(JSC::JSLock::ownerThread):
(JSC::JSLock::currentThreadIsHoldingLock):
* runtime/SamplingProfiler.cpp:
(JSC::FrameWalker::isValidFramePointer):
(JSC::SamplingProfiler::SamplingProfiler):
(JSC::SamplingProfiler::createThreadIfNecessary):
(JSC::SamplingProfiler::takeSample):
* runtime/SamplingProfiler.h:
* runtime/VM.h:
(JSC::VM::ownerThread):
* runtime/VMTraps.cpp:
(JSC::findActiveVMAndStackBounds):
(JSC::VMTraps::SignalSender::send):
(JSC::VMTraps::fireTrap):
Source/WebCore:
Mechanical change. Use Thread:: APIs.
* Modules/indexeddb/server/IDBServer.cpp:
(WebCore::IDBServer::IDBServer::IDBServer):
* Modules/indexeddb/server/IDBServer.h:
* Modules/webaudio/AsyncAudioDecoder.cpp:
(WebCore::AsyncAudioDecoder::AsyncAudioDecoder):
(WebCore::AsyncAudioDecoder::~AsyncAudioDecoder):
(WebCore::AsyncAudioDecoder::runLoop):
* Modules/webaudio/AsyncAudioDecoder.h:
* Modules/webaudio/OfflineAudioDestinationNode.cpp:
(WebCore::OfflineAudioDestinationNode::OfflineAudioDestinationNode):
(WebCore::OfflineAudioDestinationNode::uninitialize):
(WebCore::OfflineAudioDestinationNode::startRendering):
* Modules/webaudio/OfflineAudioDestinationNode.h:
* Modules/webdatabase/Database.cpp:
(WebCore::Database::securityOrigin):
* Modules/webdatabase/DatabaseThread.cpp:
(WebCore::DatabaseThread::start):
(WebCore::DatabaseThread::databaseThread):
(WebCore::DatabaseThread::recordDatabaseOpen):
(WebCore::DatabaseThread::recordDatabaseClosed):
* Modules/webdatabase/DatabaseThread.h:
(WebCore::DatabaseThread::getThreadID):
* bindings/js/GCController.cpp:
(WebCore::GCController::garbageCollectOnAlternateThreadForDebugging):
* fileapi/AsyncFileStream.cpp:
(WebCore::callOnFileThread):
* loader/icon/IconDatabase.cpp:
(WebCore::IconDatabase::open):
(WebCore::IconDatabase::close):
* loader/icon/IconDatabase.h:
* page/ResourceUsageThread.cpp:
(WebCore::ResourceUsageThread::createThreadIfNeeded):
* page/ResourceUsageThread.h:
* page/scrolling/ScrollingThread.cpp:
(WebCore::ScrollingThread::ScrollingThread):
(WebCore::ScrollingThread::isCurrentThread):
(WebCore::ScrollingThread::createThreadIfNeeded):
(WebCore::ScrollingThread::threadCallback):
* page/scrolling/ScrollingThread.h:
* platform/audio/HRTFDatabaseLoader.cpp:
(WebCore::HRTFDatabaseLoader::HRTFDatabaseLoader):
(WebCore::HRTFDatabaseLoader::loadAsynchronously):
(WebCore::HRTFDatabaseLoader::waitForLoaderThreadCompletion):
* platform/audio/HRTFDatabaseLoader.h:
* platform/audio/ReverbConvolver.cpp:
(WebCore::ReverbConvolver::ReverbConvolver):
(WebCore::ReverbConvolver::~ReverbConvolver):
* platform/audio/ReverbConvolver.h:
* platform/audio/gstreamer/AudioFileReaderGStreamer.cpp:
(WebCore::createBusFromAudioFile):
(WebCore::createBusFromInMemoryAudioFile):
* platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:
(ResourceHandleStreamingClient::ResourceHandleStreamingClient):
(ResourceHandleStreamingClient::~ResourceHandleStreamingClient):
* platform/network/cf/LoaderRunLoopCF.cpp:
(WebCore::loaderRunLoop):
* platform/network/curl/CurlDownload.cpp:
(WebCore::CurlDownloadManager::startThreadIfNeeded):
(WebCore::CurlDownloadManager::stopThread):
* platform/network/curl/CurlDownload.h:
* platform/network/curl/SocketStreamHandleImpl.h:
* platform/network/curl/SocketStreamHandleImplCurl.cpp:
(WebCore::SocketStreamHandleImpl::startThread):
(WebCore::SocketStreamHandleImpl::stopThread):
* workers/WorkerThread.cpp:
(WebCore::WorkerThread::WorkerThread):
(WebCore::WorkerThread::start):
(WebCore::WorkerThread::workerThread):
* workers/WorkerThread.h:
(WebCore::WorkerThread::threadID):
Source/WebKit:
Mechanical change. Use Thread:: APIs.
* Storage/StorageThread.cpp:
(WebCore::StorageThread::StorageThread):
(WebCore::StorageThread::~StorageThread):
(WebCore::StorageThread::start):
(WebCore::StorageThread::dispatch):
(WebCore::StorageThread::terminate):
* Storage/StorageThread.h:
Source/WebKit2:
Mechanical change. Use Thread:: APIs.
* NetworkProcess/NetworkProcess.cpp:
(WebKit::NetworkProcess::initializeNetworkProcess):
* NetworkProcess/cache/NetworkCacheIOChannelSoup.cpp:
(WebKit::NetworkCache::IOChannel::readSyncInThread):
* Platform/IPC/Connection.cpp:
(IPC::Connection::processIncomingMessage):
* Shared/EntryPointUtilities/mac/XPCService/XPCServiceEntryPoint.h:
(WebKit::XPCServiceInitializer):
* UIProcess/linux/MemoryPressureMonitor.cpp:
(WebKit::MemoryPressureMonitor::MemoryPressureMonitor):
* WebProcess/WebProcess.cpp:
(WebKit::WebProcess::initializeWebProcess):
Source/WTF:
This patch is refactoring of WTF Threading mechanism to merge JavaScriptCore's threading extension to WTF Threading.
Previously, JavaScriptCore requires richer threading features (such as suspending and resuming threads), and they
are implemented for PlatformThread in JavaScriptCore. But these features should be implemented in WTF Threading side
instead of maintaining JSC's own threading features too. This patch removes these features from JSC and move it to
WTF Threading.
However, current WTF Threading has one problem: Windows version of WTF Threading has different semantics from Pthreads
one. In Windows WTF Threading, we cannot perform any operation after the target thread is detached: WTF Threading stop
tracking the state of the thread once the thread is detached. But this is not the same to Pthreads one. In Pthreads,
pthread_detach just means that the resource of the thread will be destroyed automatically. While some operations like
pthread_join will be rejected, some operations like pthread_kill will be accepted.
The problem is that detached thread can be suspended and resumed in JSC. For example, in jsc.cpp, we start the worker
thread and detach it immediately. In worker thread, we will create VM and thus concurrent GC will suspend and resume
the detached thread. However, in Windows WTF Threading, we have no reference to the detached thread. Thus we cannot
perform suspend and resume operations onto the detached thread.
To solve the problem, we change Windows Threading mechanism drastically to align it to the Pthread semantics. In the
new Threading, we have RefPtr<Thread> class. It holds a handle to a platform thread. We can perform threading operations
with this class. For example, Thread::suspend is offered. And we use destructor of the thread local variable to release
the resources held by RefPtr<Thread>. In Windows, Thread::detach does nothing because the resource will be destroyed
automatically by RefPtr<Thread>.
To do so, we introduce ThreadHolder for Windows. This is similar to the previous ThreadIdentifierData for Pthreads.
It holds RefPtr<Thread> in the thread local storage (technically, it is Fiber Local Storage in Windows). Thread::current()
will return this reference.
The problematic situation is that the order of the deallocation of the thread local storage is not defined. So we should
not touch thread local storage in the destructor of the thread local storage. To avoid such edge cases, we have
currentThread() / Thread::currentID() APIs. They are safe to be called even in the destructor of the other thread local
storage. And in Windows, in the FLS destructor, we will create the thread_local variable to defer the destruction of
the ThreadHolder. We ensure that this destructor is called after the other FLS destructors are called in Windows 10.
This patch is performance neutral.
* WTF.xcodeproj/project.pbxproj:
* benchmarks/ConditionSpeedTest.cpp:
* benchmarks/LockFairnessTest.cpp:
* benchmarks/LockSpeedTest.cpp:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/MainThread.h:
* wtf/MemoryPressureHandler.h:
* wtf/ParallelJobsGeneric.cpp:
(WTF::ParallelEnvironment::ThreadPrivate::tryLockFor):
(WTF::ParallelEnvironment::ThreadPrivate::workerThread):
* wtf/ParallelJobsGeneric.h:
(WTF::ParallelEnvironment::ThreadPrivate::ThreadPrivate): Deleted.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::forEachImpl):
* wtf/ParkingLot.h:
(WTF::ParkingLot::forEach):
* wtf/PlatformRegisters.h: Renamed from Source/JavaScriptCore/runtime/PlatformThread.h.
* wtf/RefPtr.h:
(WTF::RefPtr::RefPtr):
* wtf/ThreadFunctionInvocation.h:
(WTF::ThreadFunctionInvocation::ThreadFunctionInvocation):
* wtf/ThreadHolder.cpp: Added.
(WTF::ThreadHolder::~ThreadHolder):
(WTF::ThreadHolder::initialize):
* wtf/ThreadHolder.h: Renamed from Source/WTF/wtf/ThreadIdentifierDataPthreads.h.
(WTF::ThreadHolder::thread):
(WTF::ThreadHolder::ThreadHolder):
* wtf/ThreadHolderPthreads.cpp: Renamed from Source/WTF/wtf/ThreadIdentifierDataPthreads.cpp.
(WTF::ThreadHolder::initializeOnce):
(WTF::ThreadHolder::current):
(WTF::ThreadHolder::destruct):
* wtf/ThreadHolderWin.cpp: Added.
(WTF::threadMapMutex):
(WTF::threadMap):
(WTF::ThreadHolder::initializeOnce):
(WTF::ThreadHolder::current):
(WTF::ThreadHolder::destruct):
* wtf/ThreadSpecific.h:
* wtf/Threading.cpp:
(WTF::Thread::normalizeThreadName):
(WTF::threadEntryPoint):
(WTF::Thread::create):
(WTF::Thread::setCurrentThreadIsUserInteractive):
(WTF::Thread::setCurrentThreadIsUserInitiated):
(WTF::Thread::setGlobalMaxQOSClass):
(WTF::Thread::adjustedQOSClass):
(WTF::Thread::dump):
(WTF::initializeThreading):
(WTF::normalizeThreadName): Deleted.
(WTF::createThread): Deleted.
(WTF::setCurrentThreadIsUserInteractive): Deleted.
(WTF::setCurrentThreadIsUserInitiated): Deleted.
(WTF::setGlobalMaxQOSClass): Deleted.
(WTF::adjustedQOSClass): Deleted.
* wtf/Threading.h:
(WTF::Thread::id):
(WTF::Thread::operator==):
(WTF::Thread::operator!=):
(WTF::Thread::joinableState):
(WTF::Thread::didBecomeDetached):
(WTF::Thread::didJoin):
(WTF::Thread::hasExited):
(WTF::currentThread):
* wtf/ThreadingPthreads.cpp:
(WTF::Thread::Thread):
(WTF::Thread::~Thread):
(WTF::Thread::signalHandlerSuspendResume):
(WTF::Thread::initializePlatformThreading):
(WTF::initializeCurrentThreadEvenIfNonWTFCreated):
(WTF::wtfThreadEntryPoint):
(WTF::Thread::createInternal):
(WTF::Thread::initializeCurrentThreadInternal):
(WTF::Thread::changePriority):
(WTF::Thread::waitForCompletion):
(WTF::Thread::detach):
(WTF::Thread::current):
(WTF::Thread::currentID):
(WTF::Thread::signal):
(WTF::Thread::resume):
(WTF::Thread::getRegisters):
(WTF::Thread::didExit):
(WTF::Thread::establish):
(WTF::PthreadState::PthreadState): Deleted.
(WTF::PthreadState::joinableState): Deleted.
(WTF::PthreadState::pthreadHandle): Deleted.
(WTF::PthreadState::didBecomeDetached): Deleted.
(WTF::PthreadState::didExit): Deleted.
(WTF::PthreadState::didJoin): Deleted.
(WTF::PthreadState::hasExited): Deleted.
(WTF::threadMapMutex): Deleted.
(WTF::initializeThreading): Deleted.
(WTF::threadMap): Deleted.
(WTF::identifierByPthreadHandle): Deleted.
(WTF::establishIdentifierForPthreadHandle): Deleted.
(WTF::pthreadHandleForIdentifierWithLockAlreadyHeld): Deleted.
(WTF::createThreadInternal): Deleted.
(WTF::initializeCurrentThreadInternal): Deleted.
(WTF::changeThreadPriority): Deleted.
(WTF::waitForThreadCompletion): Deleted.
(WTF::detachThread): Deleted.
(WTF::threadDidExit): Deleted.
(WTF::currentThread): Deleted.
(WTF::signalThread): Deleted.
* wtf/ThreadingWin.cpp:
(WTF::Thread::Thread):
(WTF::Thread::~Thread):
(WTF::Thread::initializeCurrentThreadInternal):
(WTF::Thread::initializePlatformThreading):
(WTF::wtfThreadEntryPoint):
(WTF::Thread::createInternal):
(WTF::Thread::changePriority):
(WTF::Thread::waitForCompletion):
(WTF::Thread::detach):
(WTF::Thread::resume):
(WTF::Thread::getRegisters):
(WTF::Thread::current):
(WTF::Thread::currentID):
(WTF::Thread::didExit):
(WTF::Thread::establish):
(WTF::initializeCurrentThreadInternal): Deleted.
(WTF::threadMapMutex): Deleted.
(WTF::initializeThreading): Deleted.
(WTF::threadMap): Deleted.
(WTF::storeThreadHandleByIdentifier): Deleted.
(WTF::threadHandleForIdentifier): Deleted.
(WTF::clearThreadHandleForIdentifier): Deleted.
(WTF::createThreadInternal): Deleted.
(WTF::changeThreadPriority): Deleted.
(WTF::waitForThreadCompletion): Deleted.
(WTF::detachThread): Deleted.
(WTF::currentThread): Deleted.
* wtf/WorkQueue.cpp:
(WTF::WorkQueue::concurrentApply):
* wtf/WorkQueue.h:
* wtf/cocoa/WorkQueueCocoa.cpp:
(WTF::dispatchQOSClass):
* wtf/generic/WorkQueueGeneric.cpp:
(WorkQueue::platformInitialize):
(WorkQueue::platformInvalidate):
* wtf/linux/MemoryPressureHandlerLinux.cpp:
(WTF::MemoryPressureHandler::EventFDPoller::EventFDPoller):
(WTF::MemoryPressureHandler::EventFDPoller::~EventFDPoller):
* wtf/win/MainThreadWin.cpp:
(WTF::initializeMainThreadPlatform):
Tools:
Mechanical change. Use Thread:: APIs.
* DumpRenderTree/JavaScriptThreading.cpp:
(runJavaScriptThread):
(startJavaScriptThreads):
(stopJavaScriptThreads):
* DumpRenderTree/mac/DumpRenderTree.mm:
(testThreadIdentifierMap):
* TestWebKitAPI/Tests/WTF/Condition.cpp:
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::runLockTest):
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/187690@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@215265 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-04-12 12:08:29 +00:00
NEVER_INLINE void ParkingLot : : forEachImpl ( const ScopedLambda < void ( Thread & , const void * ) > & callback )
2015-08-11 19:51:35 +00:00
{
Vector < Bucket * > bucketsToUnlock = lockHashtable ( ) ;
Hashtable * currentHashtable = hashtable . load ( ) ;
for ( unsigned i = currentHashtable - > size ; i - - ; ) {
Bucket * bucket = currentHashtable - > data [ i ] . load ( ) ;
if ( ! bucket )
continue ;
for ( ThreadData * currentThreadData = bucket - > queueHead ; currentThreadData ; currentThreadData = currentThreadData - > nextInQueue )
[WTF] Introduce Thread class and use RefPtr<Thread> and align Windows Threading implementation semantics to Pthread one
https://bugs.webkit.org/show_bug.cgi?id=170502
Reviewed by Mark Lam.
Source/JavaScriptCore:
* API/tests/CompareAndSwapTest.cpp:
(testCompareAndSwap):
* JavaScriptCore.xcodeproj/project.pbxproj:
* b3/air/testair.cpp:
* b3/testb3.cpp:
(JSC::B3::run):
* bytecode/SuperSampler.cpp:
(JSC::initializeSuperSampler):
* dfg/DFGWorklist.cpp:
* disassembler/Disassembler.cpp:
* heap/Heap.cpp:
(JSC::Heap::lastChanceToFinalize):
(JSC::Heap::notifyIsSafeToCollect):
* heap/Heap.h:
* heap/MachineStackMarker.cpp:
(JSC::MachineThreads::~MachineThreads):
(JSC::MachineThreads::addCurrentThread):
(JSC::MachineThreads::removeThread):
(JSC::MachineThreads::removeThreadIfFound):
(JSC::MachineThreads::MachineThread::MachineThread):
(JSC::MachineThreads::MachineThread::getRegisters):
(JSC::MachineThreads::MachineThread::Registers::stackPointer):
(JSC::MachineThreads::MachineThread::Registers::framePointer):
(JSC::MachineThreads::MachineThread::Registers::instructionPointer):
(JSC::MachineThreads::MachineThread::Registers::llintPC):
(JSC::MachineThreads::MachineThread::captureStack):
(JSC::MachineThreads::tryCopyOtherThreadStack):
(JSC::MachineThreads::tryCopyOtherThreadStacks):
(pthreadSignalHandlerSuspendResume): Deleted.
(JSC::threadData): Deleted.
(JSC::MachineThreads::Thread::Thread): Deleted.
(JSC::MachineThreads::Thread::createForCurrentThread): Deleted.
(JSC::MachineThreads::Thread::operator==): Deleted.
(JSC::MachineThreads::machineThreadForCurrentThread): Deleted.
(JSC::MachineThreads::ThreadData::ThreadData): Deleted.
(JSC::MachineThreads::ThreadData::~ThreadData): Deleted.
(JSC::MachineThreads::ThreadData::suspend): Deleted.
(JSC::MachineThreads::ThreadData::resume): Deleted.
(JSC::MachineThreads::ThreadData::getRegisters): Deleted.
(JSC::MachineThreads::ThreadData::Registers::stackPointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::framePointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::instructionPointer): Deleted.
(JSC::MachineThreads::ThreadData::Registers::llintPC): Deleted.
(JSC::MachineThreads::ThreadData::freeRegisters): Deleted.
(JSC::MachineThreads::ThreadData::captureStack): Deleted.
* heap/MachineStackMarker.h:
(JSC::MachineThreads::MachineThread::suspend):
(JSC::MachineThreads::MachineThread::resume):
(JSC::MachineThreads::MachineThread::threadID):
(JSC::MachineThreads::MachineThread::stackBase):
(JSC::MachineThreads::MachineThread::stackEnd):
(JSC::MachineThreads::threadsListHead):
(JSC::MachineThreads::Thread::operator!=): Deleted.
(JSC::MachineThreads::Thread::suspend): Deleted.
(JSC::MachineThreads::Thread::resume): Deleted.
(JSC::MachineThreads::Thread::getRegisters): Deleted.
(JSC::MachineThreads::Thread::freeRegisters): Deleted.
(JSC::MachineThreads::Thread::captureStack): Deleted.
(JSC::MachineThreads::Thread::platformThread): Deleted.
(JSC::MachineThreads::Thread::stackBase): Deleted.
(JSC::MachineThreads::Thread::stackEnd): Deleted.
* jit/ICStats.cpp:
(JSC::ICStats::ICStats):
(JSC::ICStats::~ICStats):
* jit/ICStats.h:
* jsc.cpp:
(functionDollarAgentStart):
(startTimeoutThreadIfNeeded):
* runtime/JSLock.cpp:
(JSC::JSLock::lock):
* runtime/JSLock.h:
(JSC::JSLock::ownerThread):
(JSC::JSLock::currentThreadIsHoldingLock):
* runtime/SamplingProfiler.cpp:
(JSC::FrameWalker::isValidFramePointer):
(JSC::SamplingProfiler::SamplingProfiler):
(JSC::SamplingProfiler::createThreadIfNecessary):
(JSC::SamplingProfiler::takeSample):
* runtime/SamplingProfiler.h:
* runtime/VM.h:
(JSC::VM::ownerThread):
* runtime/VMTraps.cpp:
(JSC::findActiveVMAndStackBounds):
(JSC::VMTraps::SignalSender::send):
(JSC::VMTraps::fireTrap):
Source/WebCore:
Mechanical change. Use Thread:: APIs.
* Modules/indexeddb/server/IDBServer.cpp:
(WebCore::IDBServer::IDBServer::IDBServer):
* Modules/indexeddb/server/IDBServer.h:
* Modules/webaudio/AsyncAudioDecoder.cpp:
(WebCore::AsyncAudioDecoder::AsyncAudioDecoder):
(WebCore::AsyncAudioDecoder::~AsyncAudioDecoder):
(WebCore::AsyncAudioDecoder::runLoop):
* Modules/webaudio/AsyncAudioDecoder.h:
* Modules/webaudio/OfflineAudioDestinationNode.cpp:
(WebCore::OfflineAudioDestinationNode::OfflineAudioDestinationNode):
(WebCore::OfflineAudioDestinationNode::uninitialize):
(WebCore::OfflineAudioDestinationNode::startRendering):
* Modules/webaudio/OfflineAudioDestinationNode.h:
* Modules/webdatabase/Database.cpp:
(WebCore::Database::securityOrigin):
* Modules/webdatabase/DatabaseThread.cpp:
(WebCore::DatabaseThread::start):
(WebCore::DatabaseThread::databaseThread):
(WebCore::DatabaseThread::recordDatabaseOpen):
(WebCore::DatabaseThread::recordDatabaseClosed):
* Modules/webdatabase/DatabaseThread.h:
(WebCore::DatabaseThread::getThreadID):
* bindings/js/GCController.cpp:
(WebCore::GCController::garbageCollectOnAlternateThreadForDebugging):
* fileapi/AsyncFileStream.cpp:
(WebCore::callOnFileThread):
* loader/icon/IconDatabase.cpp:
(WebCore::IconDatabase::open):
(WebCore::IconDatabase::close):
* loader/icon/IconDatabase.h:
* page/ResourceUsageThread.cpp:
(WebCore::ResourceUsageThread::createThreadIfNeeded):
* page/ResourceUsageThread.h:
* page/scrolling/ScrollingThread.cpp:
(WebCore::ScrollingThread::ScrollingThread):
(WebCore::ScrollingThread::isCurrentThread):
(WebCore::ScrollingThread::createThreadIfNeeded):
(WebCore::ScrollingThread::threadCallback):
* page/scrolling/ScrollingThread.h:
* platform/audio/HRTFDatabaseLoader.cpp:
(WebCore::HRTFDatabaseLoader::HRTFDatabaseLoader):
(WebCore::HRTFDatabaseLoader::loadAsynchronously):
(WebCore::HRTFDatabaseLoader::waitForLoaderThreadCompletion):
* platform/audio/HRTFDatabaseLoader.h:
* platform/audio/ReverbConvolver.cpp:
(WebCore::ReverbConvolver::ReverbConvolver):
(WebCore::ReverbConvolver::~ReverbConvolver):
* platform/audio/ReverbConvolver.h:
* platform/audio/gstreamer/AudioFileReaderGStreamer.cpp:
(WebCore::createBusFromAudioFile):
(WebCore::createBusFromInMemoryAudioFile):
* platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:
(ResourceHandleStreamingClient::ResourceHandleStreamingClient):
(ResourceHandleStreamingClient::~ResourceHandleStreamingClient):
* platform/network/cf/LoaderRunLoopCF.cpp:
(WebCore::loaderRunLoop):
* platform/network/curl/CurlDownload.cpp:
(WebCore::CurlDownloadManager::startThreadIfNeeded):
(WebCore::CurlDownloadManager::stopThread):
* platform/network/curl/CurlDownload.h:
* platform/network/curl/SocketStreamHandleImpl.h:
* platform/network/curl/SocketStreamHandleImplCurl.cpp:
(WebCore::SocketStreamHandleImpl::startThread):
(WebCore::SocketStreamHandleImpl::stopThread):
* workers/WorkerThread.cpp:
(WebCore::WorkerThread::WorkerThread):
(WebCore::WorkerThread::start):
(WebCore::WorkerThread::workerThread):
* workers/WorkerThread.h:
(WebCore::WorkerThread::threadID):
Source/WebKit:
Mechanical change. Use Thread:: APIs.
* Storage/StorageThread.cpp:
(WebCore::StorageThread::StorageThread):
(WebCore::StorageThread::~StorageThread):
(WebCore::StorageThread::start):
(WebCore::StorageThread::dispatch):
(WebCore::StorageThread::terminate):
* Storage/StorageThread.h:
Source/WebKit2:
Mechanical change. Use Thread:: APIs.
* NetworkProcess/NetworkProcess.cpp:
(WebKit::NetworkProcess::initializeNetworkProcess):
* NetworkProcess/cache/NetworkCacheIOChannelSoup.cpp:
(WebKit::NetworkCache::IOChannel::readSyncInThread):
* Platform/IPC/Connection.cpp:
(IPC::Connection::processIncomingMessage):
* Shared/EntryPointUtilities/mac/XPCService/XPCServiceEntryPoint.h:
(WebKit::XPCServiceInitializer):
* UIProcess/linux/MemoryPressureMonitor.cpp:
(WebKit::MemoryPressureMonitor::MemoryPressureMonitor):
* WebProcess/WebProcess.cpp:
(WebKit::WebProcess::initializeWebProcess):
Source/WTF:
This patch is refactoring of WTF Threading mechanism to merge JavaScriptCore's threading extension to WTF Threading.
Previously, JavaScriptCore requires richer threading features (such as suspending and resuming threads), and they
are implemented for PlatformThread in JavaScriptCore. But these features should be implemented in WTF Threading side
instead of maintaining JSC's own threading features too. This patch removes these features from JSC and move it to
WTF Threading.
However, current WTF Threading has one problem: Windows version of WTF Threading has different semantics from Pthreads
one. In Windows WTF Threading, we cannot perform any operation after the target thread is detached: WTF Threading stop
tracking the state of the thread once the thread is detached. But this is not the same to Pthreads one. In Pthreads,
pthread_detach just means that the resource of the thread will be destroyed automatically. While some operations like
pthread_join will be rejected, some operations like pthread_kill will be accepted.
The problem is that detached thread can be suspended and resumed in JSC. For example, in jsc.cpp, we start the worker
thread and detach it immediately. In worker thread, we will create VM and thus concurrent GC will suspend and resume
the detached thread. However, in Windows WTF Threading, we have no reference to the detached thread. Thus we cannot
perform suspend and resume operations onto the detached thread.
To solve the problem, we change Windows Threading mechanism drastically to align it to the Pthread semantics. In the
new Threading, we have RefPtr<Thread> class. It holds a handle to a platform thread. We can perform threading operations
with this class. For example, Thread::suspend is offered. And we use destructor of the thread local variable to release
the resources held by RefPtr<Thread>. In Windows, Thread::detach does nothing because the resource will be destroyed
automatically by RefPtr<Thread>.
To do so, we introduce ThreadHolder for Windows. This is similar to the previous ThreadIdentifierData for Pthreads.
It holds RefPtr<Thread> in the thread local storage (technically, it is Fiber Local Storage in Windows). Thread::current()
will return this reference.
The problematic situation is that the order of the deallocation of the thread local storage is not defined. So we should
not touch thread local storage in the destructor of the thread local storage. To avoid such edge cases, we have
currentThread() / Thread::currentID() APIs. They are safe to be called even in the destructor of the other thread local
storage. And in Windows, in the FLS destructor, we will create the thread_local variable to defer the destruction of
the ThreadHolder. We ensure that this destructor is called after the other FLS destructors are called in Windows 10.
This patch is performance neutral.
* WTF.xcodeproj/project.pbxproj:
* benchmarks/ConditionSpeedTest.cpp:
* benchmarks/LockFairnessTest.cpp:
* benchmarks/LockSpeedTest.cpp:
* wtf/AutomaticThread.cpp:
(WTF::AutomaticThread::start):
* wtf/CMakeLists.txt:
* wtf/MainThread.h:
* wtf/MemoryPressureHandler.h:
* wtf/ParallelJobsGeneric.cpp:
(WTF::ParallelEnvironment::ThreadPrivate::tryLockFor):
(WTF::ParallelEnvironment::ThreadPrivate::workerThread):
* wtf/ParallelJobsGeneric.h:
(WTF::ParallelEnvironment::ThreadPrivate::ThreadPrivate): Deleted.
* wtf/ParkingLot.cpp:
(WTF::ParkingLot::forEachImpl):
* wtf/ParkingLot.h:
(WTF::ParkingLot::forEach):
* wtf/PlatformRegisters.h: Renamed from Source/JavaScriptCore/runtime/PlatformThread.h.
* wtf/RefPtr.h:
(WTF::RefPtr::RefPtr):
* wtf/ThreadFunctionInvocation.h:
(WTF::ThreadFunctionInvocation::ThreadFunctionInvocation):
* wtf/ThreadHolder.cpp: Added.
(WTF::ThreadHolder::~ThreadHolder):
(WTF::ThreadHolder::initialize):
* wtf/ThreadHolder.h: Renamed from Source/WTF/wtf/ThreadIdentifierDataPthreads.h.
(WTF::ThreadHolder::thread):
(WTF::ThreadHolder::ThreadHolder):
* wtf/ThreadHolderPthreads.cpp: Renamed from Source/WTF/wtf/ThreadIdentifierDataPthreads.cpp.
(WTF::ThreadHolder::initializeOnce):
(WTF::ThreadHolder::current):
(WTF::ThreadHolder::destruct):
* wtf/ThreadHolderWin.cpp: Added.
(WTF::threadMapMutex):
(WTF::threadMap):
(WTF::ThreadHolder::initializeOnce):
(WTF::ThreadHolder::current):
(WTF::ThreadHolder::destruct):
* wtf/ThreadSpecific.h:
* wtf/Threading.cpp:
(WTF::Thread::normalizeThreadName):
(WTF::threadEntryPoint):
(WTF::Thread::create):
(WTF::Thread::setCurrentThreadIsUserInteractive):
(WTF::Thread::setCurrentThreadIsUserInitiated):
(WTF::Thread::setGlobalMaxQOSClass):
(WTF::Thread::adjustedQOSClass):
(WTF::Thread::dump):
(WTF::initializeThreading):
(WTF::normalizeThreadName): Deleted.
(WTF::createThread): Deleted.
(WTF::setCurrentThreadIsUserInteractive): Deleted.
(WTF::setCurrentThreadIsUserInitiated): Deleted.
(WTF::setGlobalMaxQOSClass): Deleted.
(WTF::adjustedQOSClass): Deleted.
* wtf/Threading.h:
(WTF::Thread::id):
(WTF::Thread::operator==):
(WTF::Thread::operator!=):
(WTF::Thread::joinableState):
(WTF::Thread::didBecomeDetached):
(WTF::Thread::didJoin):
(WTF::Thread::hasExited):
(WTF::currentThread):
* wtf/ThreadingPthreads.cpp:
(WTF::Thread::Thread):
(WTF::Thread::~Thread):
(WTF::Thread::signalHandlerSuspendResume):
(WTF::Thread::initializePlatformThreading):
(WTF::initializeCurrentThreadEvenIfNonWTFCreated):
(WTF::wtfThreadEntryPoint):
(WTF::Thread::createInternal):
(WTF::Thread::initializeCurrentThreadInternal):
(WTF::Thread::changePriority):
(WTF::Thread::waitForCompletion):
(WTF::Thread::detach):
(WTF::Thread::current):
(WTF::Thread::currentID):
(WTF::Thread::signal):
(WTF::Thread::resume):
(WTF::Thread::getRegisters):
(WTF::Thread::didExit):
(WTF::Thread::establish):
(WTF::PthreadState::PthreadState): Deleted.
(WTF::PthreadState::joinableState): Deleted.
(WTF::PthreadState::pthreadHandle): Deleted.
(WTF::PthreadState::didBecomeDetached): Deleted.
(WTF::PthreadState::didExit): Deleted.
(WTF::PthreadState::didJoin): Deleted.
(WTF::PthreadState::hasExited): Deleted.
(WTF::threadMapMutex): Deleted.
(WTF::initializeThreading): Deleted.
(WTF::threadMap): Deleted.
(WTF::identifierByPthreadHandle): Deleted.
(WTF::establishIdentifierForPthreadHandle): Deleted.
(WTF::pthreadHandleForIdentifierWithLockAlreadyHeld): Deleted.
(WTF::createThreadInternal): Deleted.
(WTF::initializeCurrentThreadInternal): Deleted.
(WTF::changeThreadPriority): Deleted.
(WTF::waitForThreadCompletion): Deleted.
(WTF::detachThread): Deleted.
(WTF::threadDidExit): Deleted.
(WTF::currentThread): Deleted.
(WTF::signalThread): Deleted.
* wtf/ThreadingWin.cpp:
(WTF::Thread::Thread):
(WTF::Thread::~Thread):
(WTF::Thread::initializeCurrentThreadInternal):
(WTF::Thread::initializePlatformThreading):
(WTF::wtfThreadEntryPoint):
(WTF::Thread::createInternal):
(WTF::Thread::changePriority):
(WTF::Thread::waitForCompletion):
(WTF::Thread::detach):
(WTF::Thread::resume):
(WTF::Thread::getRegisters):
(WTF::Thread::current):
(WTF::Thread::currentID):
(WTF::Thread::didExit):
(WTF::Thread::establish):
(WTF::initializeCurrentThreadInternal): Deleted.
(WTF::threadMapMutex): Deleted.
(WTF::initializeThreading): Deleted.
(WTF::threadMap): Deleted.
(WTF::storeThreadHandleByIdentifier): Deleted.
(WTF::threadHandleForIdentifier): Deleted.
(WTF::clearThreadHandleForIdentifier): Deleted.
(WTF::createThreadInternal): Deleted.
(WTF::changeThreadPriority): Deleted.
(WTF::waitForThreadCompletion): Deleted.
(WTF::detachThread): Deleted.
(WTF::currentThread): Deleted.
* wtf/WorkQueue.cpp:
(WTF::WorkQueue::concurrentApply):
* wtf/WorkQueue.h:
* wtf/cocoa/WorkQueueCocoa.cpp:
(WTF::dispatchQOSClass):
* wtf/generic/WorkQueueGeneric.cpp:
(WorkQueue::platformInitialize):
(WorkQueue::platformInvalidate):
* wtf/linux/MemoryPressureHandlerLinux.cpp:
(WTF::MemoryPressureHandler::EventFDPoller::EventFDPoller):
(WTF::MemoryPressureHandler::EventFDPoller::~EventFDPoller):
* wtf/win/MainThreadWin.cpp:
(WTF::initializeMainThreadPlatform):
Tools:
Mechanical change. Use Thread:: APIs.
* DumpRenderTree/JavaScriptThreading.cpp:
(runJavaScriptThread):
(startJavaScriptThreads):
(stopJavaScriptThreads):
* DumpRenderTree/mac/DumpRenderTree.mm:
(testThreadIdentifierMap):
* TestWebKitAPI/Tests/WTF/Condition.cpp:
* TestWebKitAPI/Tests/WTF/Lock.cpp:
(TestWebKitAPI::runLockTest):
* TestWebKitAPI/Tests/WTF/ParkingLot.cpp:
Canonical link: https://commits.webkit.org/187690@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@215265 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-04-12 12:08:29 +00:00
callback ( currentThreadData - > thread . get ( ) , currentThreadData - > address ) ;
2015-08-11 19:51:35 +00:00
}
unlockHashtable ( bucketsToUnlock ) ;
}
} // namespace WTF