Streaming model for massive vertex and index creations: minimize memory allocations, range checks and other conditions
Reduce number of segments length calculations (sqrt) and bbox (min and max).
Update distances and bboxes on a whole buffer and only if necessary. For shapes without strokes compute distances not necessary at all. bbox can be updated only on the final stage of geometry workflow, but not on the each stage.
Using stack memory instead of heap. its more cache friendly and did not fragment memory, faster memory allocations (weak place of realization)
Using cache for points distances and whole path length. Updates only if necessary
Validation of geometry consistency executes only on the final stage of path life cicle. It more friendly for data streaming: no any conditions and branches.
Using binary search for strokes trimming
Pre-cached circles geometry for caps and joints
Refactored strokes elements generation functions. Code is more readable and modifiable in general. Can be easily fixed if some geometry issues will be finded
- used hardware blending stage for scene blending
- used AABB for scene blending
- reduced number of offfscreen buffers coping
- reduced number of render pass switching
- used render pipelines abilities to convert offscreen pixel format to screen pixel format
- removed unused shaders
Deep shader refactoring for the following purposes:
* used pre-calculated gradient texture instead of per-pixel gradient map computation
* used HW wrap samples for fill spread setting
* unified gradient shader types
* used single shader module for composition instead of signle module per composition type
* used single shader module for blending for each of fill type (solid, gradient, image) instaed of signle module per blend type
* much easier add new composition and blend equations
* get rided std::string uasge
* shaders code is more readable
* bind groups creation in real time removed - performance boost
* blend and composition shaders decomposed - performance boost
* shader modules and pipeline layouts generalized - less memory usage
* shared single stencil buffer used - less memory usage
* bind groups usage simplified
* general context API simplified and generalized
* all rendering logic moved into new composition class
* ready for hardware MSAA (in next steps)
* ready for direct mask applience (in next steps)