Plain C, release build

| Operation (resolution HxW: 1280x720) | Time (ms) | Basic operations involved |
|---|---|---|
| 3x3 avg filter, naive | 1.02 | load/store; +: 8; /: 1 |
| 5x5 avg filter, naive | 2.54 | load/store; +: 24; /: 1 |
| 3x3 winsum avg filter | 0.93 | load/store; +: 2; /: 1; plus window setup |
| 5x5 winsum avg filter | 0.94 | load/store; +: 2; /: 1; plus window setup |
| 2-to-1 downsample | 0.23 | load/store |
| 1/4 HxW upsample (video filter) | 1.52 | load/store; +: 4; /: 5 |
| 1/16 HxW upsample (video filter) | 1.51 | load/store; +: 4; /: 5 |
| 1/4 HxW upsample (specialRatio_bilinear up) | 0.60 | load/store; +: 4; /: 1 |
| 1/16 HxW upsample (specialRatio_bilinear up) | 0.44 | load/store; +: 4; /: 1 |
| 3x3 avg filter at 1/4 HxW, naive | 0.26 | |
| 5x5 avg filter at 1/4 HxW, winsum | 0.24 | |

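
The nearly identical 3x3 and 5x5 winsum timings are what a running window sum would be expected to give: the arithmetic per pixel no longer depends on the kernel size. Below is a minimal sketch of that idea in plain C, not the code that was benchmarked. The function name `box_filter_winsum`, the single-channel `uint8_t` layout, the truncating division, and the choice to leave border pixels unchanged are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Average filter of size (2r+1)x(2r+1) using running sums.
 * Hypothetical helper: pixels closer than r to an edge are copied from src.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int box_filter_winsum(const uint8_t *src, uint8_t *dst, int w, int h, int r)
{
    assert(src && dst && w > 0 && h > 0 && r >= 0);
    const int k = 2 * r + 1;
    const uint32_t area = (uint32_t)k * (uint32_t)k;

    memcpy(dst, src, (size_t)w * h);          /* borders keep the input value */
    if (w < k || h < k)
        return 0;                             /* no pixel has a full window */

    uint32_t *col = calloc((size_t)w, sizeof *col);
    if (!col)
        return -1;

    /* Window setup: per-column sums over the first k rows. */
    for (int y = 0; y < k; ++y)
        for (int x = 0; x < w; ++x)
            col[x] += src[(size_t)y * w + x];

    for (int y = r; y < h - r; ++y) {
        if (y > r) {
            /* Slide the column sums down: add the new row, drop the old one. */
            const uint8_t *add = src + (size_t)(y + r) * w;
            const uint8_t *sub = src + (size_t)(y - r - 1) * w;
            for (int x = 0; x < w; ++x)
                col[x] = col[x] + add[x] - sub[x];
        }

        /* Sum of the first k column sums gives the window centered at x = r. */
        uint32_t sum = 0;
        for (int x = 0; x < k; ++x)
            sum += col[x];

        uint8_t *out = dst + (size_t)y * w;
        out[r] = (uint8_t)(sum / area);

        /* Slide right: one add and one subtract per pixel, then one divide. */
        for (int x = r + 1; x < w - r; ++x) {
            sum = sum + col[x + r] - col[x - r - 1];
            out[x] = (uint8_t)(sum / area);
        }
    }

    free(col);
    return 0;
}
```

Calling `box_filter_winsum(src, dst, 1280, 720, 1)` would give a 3x3 average and `r = 2` a 5x5 average; only the window setup grows with `r`, the inner loops stay the same.
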
Hand-written SIMD on ARMv8: speedup

| Operation (resolution HxW: 1280x720) | Time (ms) | Basic operations involved |
|---|---|---|
| 3x3 avg filter, naive | | |
| 5x5 avg filter, naive | | |
| 3x3 winsum avg filter | 0.13 | |
| 5x5 winsum avg filter | 0.14 | |
| 2-to-1 downsample | | |
| 1/4 HxW upsample (video filter) | | |
| 1/16 HxW upsample (video filter) | | |
| 1/4 HxW upsample (specialRatio_bilinear up) | | |
| 1/16 HxW upsample (specialRatio_bilinear up) | | |
| 3x3 avg filter at 256x144, winsum | 0.01 | |
| 5x5 avg filter at 256x144, winsum | 0.01 | |

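Common SIMD operations (intrinsics)
mla: doing multiply-add or multiply-subtract as one combined operation is faster than issuing the multiply and the add/subtract separately.
Loops are expensive: per-iteration overhead adds up.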
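
As an illustration of the mla note, here is a small sketch that accumulates `acc[i] += a[i] * b[i]` with the NEON intrinsic `vmlaq_u16`, which computes `acc + a * b` per lane. The function name `mla_u16` and the 16-bit element type are assumptions chosen for the example. The NEON path is guarded by `__ARM_NEON` so the same file still builds with a scalar fallback on other targets; results wrap modulo 2^16 in both paths.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* acc[i] += a[i] * b[i] for n elements (hypothetical helper).
 * Arithmetic wraps modulo 2^16, matching the 16-bit NEON lanes. */
void mla_u16(uint16_t *acc, const uint16_t *a, const uint16_t *b, size_t n)
{
    size_t i = 0;

#if defined(__ARM_NEON)
    /* Eight 16-bit lanes per iteration. */
    for (; i + 8 <= n; i += 8) {
        uint16x8_t va   = vld1q_u16(a + i);
        uint16x8_t vb   = vld1q_u16(b + i);
        uint16x8_t vacc = vld1q_u16(acc + i);
        vacc = vmlaq_u16(vacc, va, vb);   /* vacc + va * vb, one multiply-accumulate */
        vst1q_u16(acc + i, vacc);
    }
#endif

    /* Scalar tail, and the whole range when NEON is not available. */
    for (; i < n; ++i)
        acc[i] = (uint16_t)(acc[i] + (uint32_t)a[i] * b[i]);
}
```

Besides replacing a separate multiply and add, the vector loop handles eight elements per iteration, so it runs roughly one eighth as many iterations as the scalar loop, which also touches on the loop-cost note.
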