Intel ITB999ASGE1 User Manual
Additional components for Performance and Productivity
Parallel Algorithms
Generic implementation of
common patterns
Generic implementations of parallel patterns such as parallel loops, flow graphs, and pipelines can be an
easy way to achieve a scalable parallel implementation without developing a custom solution from scratch.
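As a rough illustration of what a generic parallel loop does, the sketch below splits an index range into contiguous chunks and runs each chunk on its own thread. It uses only the C++ standard library; `simple_parallel_for` and `squares` are hypothetical names for this example and are not part of the Intel TBB API, which adds work stealing and load balancing that this sketch does not attempt.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not an Intel TBB API): calls body(i) for every i in
// [first, last), giving each thread one contiguous chunk of the range.
template <typename Body>
void simple_parallel_for(std::size_t first, std::size_t last,
                         unsigned num_threads, const Body& body) {
    assert(num_threads > 0);
    if (first >= last) return;
    const std::size_t n = last - first;
    const std::size_t chunk = (n + num_threads - 1) / num_threads;  // round up
    std::vector<std::thread> workers;
    for (std::size_t begin = first; begin < last; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, last);
        workers.emplace_back([begin, end, &body] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (std::thread& w : workers) w.join();
}

// Fills out[i] with i * i. Each index is written by exactly one thread,
// so no locking is needed.
std::vector<long> squares(std::size_t n, unsigned num_threads) {
    std::vector<long> out(n);
    simple_parallel_for(0, n, num_threads, [&out](std::size_t i) {
        out[i] = static_cast<long>(i * i);
    });
    return out;
}
```

Calling `squares(10, 3)` runs three chunks covering indices 0 to 3, 4 to 7, and 8 to 9. A library implementation lets the caller keep the loop body and drop the thread management.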
Concurrent Containers
Generic implementation of
common idioms for
concurrent access
Intel® Threading Building Blocks (Intel® TBB) concurrent containers are a concurrency-friendly alternative
to serial data containers. Serial data structures (such as C++ STL containers) often require a global lock to
protect them from concurrent access and modification; Intel TBB concurrent containers allow multiple
threads to access and update items in the container concurrently, which increases concurrency and
improves an application’s scalability.
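One common way to avoid a single global lock is to split a container into independently locked parts. The sketch below shows that idea with standard C++ only. `StripedCounterMap` and `run_demo` are hypothetical names for illustration, and this is not how Intel TBB implements its containers; it only shows why finer-grained locking lets threads proceed without waiting on one shared mutex.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical class for illustration; not an Intel TBB container.
// Each key hashes to one of kStripes sub-maps with its own mutex, so threads
// that touch different stripes do not block each other.
class StripedCounterMap {
public:
    void increment(const std::string& key) {
        Stripe& s = stripe_for(key);
        std::lock_guard<std::mutex> guard(s.lock);
        ++s.counts[key];
    }

    int get(const std::string& key) {
        Stripe& s = stripe_for(key);
        std::lock_guard<std::mutex> guard(s.lock);
        auto it = s.counts.find(key);
        return it == s.counts.end() ? 0 : it->second;
    }

private:
    static const std::size_t kStripes = 16;
    struct Stripe {
        std::mutex lock;
        std::unordered_map<std::string, int> counts;
    };
    Stripe& stripe_for(const std::string& key) {
        return stripes_[std::hash<std::string>()(key) % kStripes];
    }
    std::array<Stripe, kStripes> stripes_;
};

// Starts num_threads threads that each increment every key per_thread times,
// then returns the final count stored for "alpha".
int run_demo(unsigned num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread >= 0);
    StripedCounterMap map;
    const std::vector<std::string> keys = {"alpha", "beta", "gamma", "delta"};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&map, &keys, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                for (const std::string& k : keys) map.increment(k);
        });
    }
    for (std::thread& w : workers) w.join();
    return map.get("alpha");
}
```

With four threads and 1000 increments each, every key ends at 4000. The locks guarantee that no update is lost, and only threads that hit the same stripe contend.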
Synchronization Primitives
Exception-safe locks,
condition variables, and
atomic operations
Intel TBB provides a comprehensive set of synchronization primitives with different trade-offs that are
applicable to common synchronization strategies. Exception-safe implementation of locks helps to avoid
deadlock in programs that use C++ exceptions. Using Intel TBB atomic variables instead of the C-style
atomic API reduces the risk of data races.
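The exception-safety point can be sketched with the standard library's scoped lock, which follows the same acquire-in-constructor, release-in-destructor pattern. `TrackedMutex`, `Account`, `withdraw`, and `lock_released_after_throw` are hypothetical names for this example; the sketch uses `std::lock_guard` rather than Intel TBB's own lock classes.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical wrapper that records whether the mutex is currently held,
// so the effect of stack unwinding can be observed.
struct TrackedMutex {
    std::mutex m;
    bool held = false;
    void lock() { m.lock(); held = true; }
    void unlock() { held = false; m.unlock(); }
};

struct Account {
    TrackedMutex lock;  // protects balance
    int balance = 100;
};

// Withdraws amount while holding the lock. The lock_guard destructor
// releases the mutex on every exit path, including the throw below.
void withdraw(Account& account, int amount) {
    assert(amount >= 0);
    std::lock_guard<TrackedMutex> guard(account.lock);
    if (amount > account.balance) throw std::runtime_error("insufficient funds");
    account.balance -= amount;
}

// Forces withdraw() to throw, then reports whether the lock was released.
bool lock_released_after_throw(Account& account) {
    try {
        withdraw(account, account.balance + 1);
    } catch (const std::runtime_error&) {
        // Expected: the request exceeds the balance.
    }
    return !account.lock.held;
}
```

If `withdraw` called `lock()` and `unlock()` by hand, the throw would skip the unlock and the next caller would block forever. The scoped guard removes that failure mode.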
Scalable Memory Allocators
Scalable memory manager
and false-sharing free
memory allocator
The scalable memory allocator avoids scalability bottlenecks by minimizing access to a shared memory
heap via per-thread memory pool management. Special management of large (≥8KB) blocks allows more
efficient resource usage, while still offering scalability and competitive performance. The cache-aligned
memory allocator avoids false-sharing by not allowing allocated memory blocks to split a cache line.
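False sharing occurs when two threads write to different variables that happen to sit on the same cache line. The sketch below shows the padding idea behind a cache-aligned allocation using standard C++ `alignas`. It assumes a 64-byte cache line, which is common on x86 but not universal, and `PaddedCounter`, `same_cache_line`, and `padded_counters_are_isolated` are hypothetical names, not Intel TBB APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Adjust for other hardware.
constexpr std::size_t kCacheLine = 64;

// Each counter is aligned to, and therefore padded out to, a full cache
// line, so neighbouring counters never occupy the same line.
struct alignas(kCacheLine) PaddedCounter {
    long value = 0;
};

// Returns true if both addresses fall within the same cache line.
bool same_cache_line(const void* a, const void* b) {
    assert(a != nullptr && b != nullptr);
    const std::uintptr_t line_a = reinterpret_cast<std::uintptr_t>(a) / kCacheLine;
    const std::uintptr_t line_b = reinterpret_cast<std::uintptr_t>(b) / kCacheLine;
    return line_a == line_b;
}

// Checks that two adjacent padded counters live on different cache lines.
bool padded_counters_are_isolated() {
    PaddedCounter counters[2];
    return !same_cache_line(&counters[0], &counters[1]);
}
```

The cost is memory: each `PaddedCounter` occupies 64 bytes instead of the size of one `long`. That trade is usually worthwhile only for data that several threads update frequently.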
Create arbitrary task trees
When an algorithm cannot be expressed with high-level Intel TBB constructs, the user can choose to
create arbitrary task trees. Tasks can be spawned for better locality and performance, or enqueued to
maintain FIFO-like order and ensure starvation-resistant execution.
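The difference between the two submission modes can be pictured with a toy, single-threaded model. It assumes that spawned tasks are taken newest-first by the thread that created them (which is where the locality benefit would come from) and that enqueued tasks are taken oldest-first. `run_lifo` and `run_fifo` are hypothetical names, and the model is not the Intel TBB scheduler.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Toy model: tasks are plain integers added in the order 0, 1, ..., n-1.
// Each function returns the order in which the tasks would run.

// Newest-first (LIFO): the most recently added task runs first, so it tends
// to reuse data that is still hot in cache.
std::vector<int> run_lifo(int n) {
    assert(n >= 0);
    std::deque<int> pool;
    for (int t = 0; t < n; ++t) pool.push_back(t);
    std::vector<int> order;
    while (!pool.empty()) {
        order.push_back(pool.back());
        pool.pop_back();
    }
    return order;
}

// Oldest-first (FIFO): tasks run in submission order, so no task waits
// indefinitely behind newer work.
std::vector<int> run_fifo(int n) {
    assert(n >= 0);
    std::deque<int> pool;
    for (int t = 0; t < n; ++t) pool.push_back(t);
    std::vector<int> order;
    while (!pool.empty()) {
        order.push_back(pool.front());
        pool.pop_front();
    }
    return order;
}
```

In this model the first task added under newest-first ordering runs last. If new tasks kept arriving it might never run, which is the starvation that FIFO-like ordering avoids.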
Conditional Numerical
Reproducibility
Ensure deterministic associativity for floating-point arithmetic results with the new Intel TBB template
function ‘parallel_deterministic_reduce’.
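Floating-point addition is not associative, so a parallel sum can change from run to run if the grouping of additions depends on how work happens to be scheduled. The sketch below shows one way to fix the grouping: always split the range at the midpoint down to a fixed grain size, independent of how many threads take part. `tree_sum` and `make_data` are hypothetical helpers written with the standard library; this illustrates the principle and is not the Intel TBB implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper. Sums data[begin, end) by splitting at the midpoint
// until a piece has at most `grain` elements. The split points depend only
// on the range and the grain size, so the additions are grouped the same way
// on every run. `parallel_depth` is the number of top levels whose left half
// runs on another thread; it changes the schedule, not the grouping.
double tree_sum(const std::vector<double>& data, std::size_t begin,
                std::size_t end, std::size_t grain, int parallel_depth) {
    assert(grain > 0 && begin <= end && end <= data.size());
    if (end - begin <= grain) {
        double s = 0.0;
        for (std::size_t i = begin; i < end; ++i) s += data[i];
        return s;
    }
    const std::size_t mid = begin + (end - begin) / 2;
    if (parallel_depth > 0) {
        std::future<double> left = std::async(
            std::launch::async, [&data, begin, mid, grain, parallel_depth] {
                return tree_sum(data, begin, mid, grain, parallel_depth - 1);
            });
        const double right = tree_sum(data, mid, end, grain, parallel_depth - 1);
        return left.get() + right;
    }
    const double left = tree_sum(data, begin, mid, grain, 0);
    const double right = tree_sum(data, mid, end, grain, 0);
    return left + right;
}

// Builds test data with widely varying magnitudes: 1, 1/2, 1/3, ...
std::vector<double> make_data(std::size_t n) {
    std::vector<double> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = 1.0 / static_cast<double>(i + 1);
    return v;
}
```

Running `tree_sum` with `parallel_depth` set to 0 or to 3 performs the same additions in the same grouping, so the two results compare equal with `==`. Changing the grain size changes the grouping and may change the last bits of the result.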
C++11 Support
Intel TBB can be used with C++11 compilers and supports lambda expressions. For developers using
parallel algorithms, lambda expressions reduce the time and code needed by removing the requirement for
separate objects or classes.
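The saving is easiest to see side by side. In the sketch below, `for_each_index` is a hypothetical serial stand-in for a generic loop algorithm that accepts any callable; the first version passes a hand-written body class and the second passes a lambda that captures the same state inline.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a generic loop algorithm: calls body(i) for each
// i in [0, n). A parallel algorithm would accept the same kind of callable.
template <typename Body>
void for_each_index(std::size_t n, const Body& body) {
    for (std::size_t i = 0; i < n; ++i) body(i);
}

// Pre-C++11 style: a separate class carries the state the loop body needs.
class ScaleBody {
public:
    ScaleBody(std::vector<double>& v, double factor) : v_(v), factor_(factor) {}
    void operator()(std::size_t i) const {
        assert(i < v_.size());
        v_[i] *= factor_;
    }
private:
    std::vector<double>& v_;
    double factor_;
};

std::vector<double> scale_with_functor(std::vector<double> v, double factor) {
    for_each_index(v.size(), ScaleBody(v, factor));
    return v;
}

// C++11 style: the lambda captures the same state where the loop is written.
std::vector<double> scale_with_lambda(std::vector<double> v, double factor) {
    for_each_index(v.size(), [&v, factor](std::size_t i) { v[i] *= factor; });
    return v;
}
```

Both functions produce the same result. The lambda version replaces the eleven-line class with a single expression.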
Select the right Intel® Threading Building Blocks (Intel® TBB) license
Commercial Binary Distribution for customers who may require commercial support services. Attractive pricing available for academic,
student and classroom usage.
Open Source Distribution can be used under GPLv2 with the runtime exception allowing usage in proprietary applications. Allows support
for additional OSs and hardware platforms. Both source and binary forms are available for download from
Custom license available if you require the ability to modify or distribute the commercial source code of Intel TBB. Contact your Intel
representative for more information.
What’s New in version 4.2
Feature
Benefit
Support for Latest Intel
Architectures
Take advantage of the newest features in Intel’s latest processors including Transactional Synchronization
Extensions (TSX). Adds support for Intel® Xeon Phi™ coprocessor for Windows and Intel® Xeon™ Processor
(Ivy Bridge-EP).
Selecting the best models for your application today will set a path for you to take full advantage of
multicore and many-core performance without re-writing your code. Start today by implementing
parallelism for today’s architecture and be ready for future architectures.
Lower memory overhead
Improved heuristics in the memory allocator reduce memory overhead by intelligently releasing unused or
stale memory.
Improved handling of large
memory requests
Improved handling of large (larger than 8 KB, up to 128 MB) memory requests gives better performance for
applications that make frequent large allocations. Use of big memory pages can now be explicitly enabled
via a function call or environment variable.
Better Fork Support
A user-enabled API provides fork safety by ensuring that Intel TBB worker threads have completed before
a fork is executed.
PPL* Compatibility
Improved compatibility with the Parallel Patterns Library (PPL) through the addition of the
concurrent_unordered_multimap and concurrent_unordered_multiset APIs.
Windows* Store
Customers who use Intel TBB in their applications can now submit and sell their apps through the Windows
Store.