revert d520da2db8
After considered the scneario seriously,
this doesn't helpful for the performance at all...
just increased code complexity. earlier bad decision... revert it.
After 362d2df the fastTrack cases were applied even for shapes with clips.
These changes fixed this - the check whether a shape is a rect should be done
only if it has no clips.
This patch introduces _STRAIGHT colorspaces (ABGR8888_STRAIGHT and
ARGB8888_STRAIGHT) whose colors are un-alpha-premultiplied. Unmultiplicated
colors are especially needed for wasm thorvg loader and svg2png / tvg2png.
Only C version now.
@issue: #791
there was a wrong condition introduced the bug that image was not updated,
because transformation is not re-applied after the first created the internal image data.
@Issues: https://github.com/Samsung/thorvg/issues/751
Separate simd implementation by files to maintain them easier.
Now we have avx, c, neon version implementation base,
we can add implementations to them.
Changes:
- Prepare neon verison of ALPHA_BLEND API.
- Use ALPHA_BLEND_NEON in _translucentRle
Notes:
- _translucentRle with neon support reduces execution time of this
function ~ 300 % (measured on uint32_t 400 x 400 buffer).
- API was tested on ARMv7l device with GCC 9.2 based toolchain. Results
on other devices could be different.
Pixel-based image picture doesn't work at size() method.
Obvisouly, we missed to implement it properly.
This patch corrects it.
@Issue: https://github.com/Samsung/thorvg/issues/656
Changes:
- Added 'neon' vector option in build system
- Introduced neon version of rasterRGBA32 API, which improves
speed of the funciton on ARM cpu's around ~35%
changed alpha channel data type to 32 bits from 8 bits,
since subsequent data operations requires 32 bits values.
this 8 bits (since channel range is up to 255) doesn't helpful
for saving memory size because it would generate additional data casting by compiler.
I compared the binary size and this patch saves about 600bytes.
Calculations accuracy in ALPHA_BLEND function has been
improved. Until now blending resulted in a slight hue change
(all color channels affected). The chosen method of calculation
is a compromise between the accuracy and the performance.
We have encountered that multi-threading usage that user creates,
multiple canvases owned by multiple user threads.
Current sw_engine memory pool has been considered only for multi-threads,
spawned by tvg task scheduler.
In this case it's safe but when user threads introduced, it can occur race-condition.
Thus, Here is a renewal policy that non-threading tvg(initialized threads with zero),
takes care of multiple user threads bu changing its policy,
each of canvases should have individual memory pool to guarantee mutual-exclusion.
@API additions
enum MempoolPolicy
{
Default = 0, ///< Default behavior that ThorVG is designed to.
Shareable, ///< Memory Pool is shared among the SwCanvases.
Individual ///< Allocate designated memory pool that is only used by current instance.
};
Result SwCanvas::mempool(MempoolPolicy policy) noexcept;
All in all, if user calls multiple threads, set memory pool policy to Individual.
In the fillFetchLinear function the offset parameter was removed.
The destination address may be shifted directly in the dst parameter,
it doesn't need to be passed separately.
previously alpha multiplying operation doesn't have perfect precision,
could loss 1 pixel since it divides 255 values by 256.
This improved operation comply with both precision & performance.
If ClipPath is a singular rectangle,
we don't need to apply this to all children nodes to adjust rle span regions.
Rather than its regular sequence,
we can adjust render region as merging viewport that is introduced internally,
All in all,
If a Paint has a single ClipPath that is Rectangle,
it sets viewport with Rectangle area that viewport is applied to
raster engine to cut off the rendering boundary.
In the normal case it brings trivial effects.
but when use SVGs which has a viewbox, it could increase the performance
up to 10% (profiled with 200 svgs rendering at the same time)
Note that, this won't be applied if the Paint has affine or rotation transform.
@Issues: 294
Description:
Crash was observed in examples when composite object was used.
It was caused because __m256i object was used on non aligned
memory to 32bit. Algorithm in this function was changed to use
unaligned __m256i_u object. Code was also simplified.
* sw_engine: adding a gradient as a stroke feature
Similarly as a shape may have a gradient fill so can the stroke.
* Capi: adding APIs for a gradient stroke
Co-authored-by: Hermet Park <hermetpark@gmail.com>
added a routine that draw non-transformed translucent image.
composition images will use this routine to draw faster.
Also added optimization point comments in raster image.
Renamed internal interfaces.
We need both blender & compositor interfaces.
Renamed SwCompositor -> SwBlender which is for pixel joining methods.
Added (SwCompositor, Compositor) which is designed for compositing images.
Introduce RendererMethod::renderRegion() to return acutal drawing region info.
That is used by scene composition to composite actual partial drawing region
for better performance.
@Issues: 173
There is 1 pixel misaligned issue observed.
Found out transform() increases 0.5 pt always.
This transform() logic was broken by this change - e00f948705
and now recorvered to origin.
* sw_engine raster: code refactoring & optimize code.
1. move the computation out of rolling if possible.
2. renamed internal variables & function prototypes.
Add RawLoader class that loads and display raw images,
and adds a Rasterizer for image data.
Image data can be loaded via picture.
Loaded image supports Composition, Transformation and Alpha blending.
New API
Result load(uint32_t* data, uint32_t width, uint32_t height, bool isCopy) noexcept;