### Test your CPU: Run `gocryptfs -speed`

Originally written by [@lxp](https://github.com/lxp) in [Issue #23](https://github.com/rfjakob/gocryptfs/issues/23)

Older benchmark output uses different names for the same tests:

* `Benchmark4kEncStupidGCM` = AES-GCM-256-OpenSSL
* `Benchmark4kEncGoGCM` = AES-GCM-256-Go

Recent gocryptfs versions provide `gocryptfs -speed`, which runs the benchmarks and prints nicer output.

The tests were run on `go version go1.6 linux/amd64` unless noted otherwise.

### 64-bit Intel/AMD (amd64) with AES-NI

**Kaby Lake (Launch: Q2'17)**
```
$ cat /proc/cpuinfo | grep -E "model name|flags" | head -2
model name	: Intel(R) Core(TM) i3-7130U CPU @ 2.70GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp flush_l1d
$ ./gocryptfs -version
gocryptfs v1.7-23-gcc0a603; go-fuse v1.0.0-186-g467f4e0; 2019-04-14 go1.12.4
$ ./gocryptfs -speed
AES-GCM-256-OpenSSL 	 877.83 MB/s	
AES-GCM-256-Go      	1905.48 MB/s	(selected in auto mode)
AES-SIV-512-Go      	 212.29 MB/s	
```
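
Results in this format are easy to collect and compare across machines. The Python sketch below parses the result lines of `gocryptfs -speed` as they appear on this page. The names `SpeedResult` and `parse_speed_output` are made up for this illustration and are not part of gocryptfs, and the regular expression is an assumption based only on the sample outputs shown here.

```python
import re
from typing import List, NamedTuple


class SpeedResult(NamedTuple):
    name: str        # backend name as printed, e.g. "AES-GCM-256-Go"
    mb_per_s: float  # throughput figure as printed
    selected: bool   # True if the line says "(selected in auto mode)"


# One result line: name, number, "MB/s", optional parenthesised note.
# Assumption: this matches the output format shown on this page.
LINE_RE = re.compile(r"^(\S+)\s+([0-9.]+) MB/s\s*(\(.*\))?\s*$")


def parse_speed_output(text: str) -> List[SpeedResult]:
    """Return one SpeedResult per line that carries a throughput figure."""
    results = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # shell prompts, version lines, "N/A" entries
        note = m.group(3) or ""
        results.append(
            SpeedResult(m.group(1), float(m.group(2)),
                        "selected in auto mode" in note)
        )
    return results


SAMPLE = """\
$ ./gocryptfs -speed
AES-GCM-256-OpenSSL      877.83 MB/s
AES-GCM-256-Go          1905.48 MB/s    (selected in auto mode)
AES-SIV-512-Go           212.29 MB/s
"""

for r in parse_speed_output(SAMPLE):
    print(r.name, r.mb_per_s, "selected" if r.selected else "")
```

Lines without a throughput figure, such as an `N/A` entry, are skipped rather than reported as zero.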


**Skylake (Launch: Q3'15)**
```
$ cat /proc/cpuinfo
model name	: Intel(R) Core(TM) i3-6100U CPU @ 2.30GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm hwp hwp_notify hwp_act_window hwp_epp intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4	  200000	     10688 ns/op	 383.22 MB/s
Benchmark4kEncGoGCM-4    	  300000	      4073 ns/op	1005.57 MB/s
```
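
The old-style output reports both `ns/op` and `MB/s`. As a rough cross-check, the second figure can be recomputed from the first, assuming each operation processes one 4096-byte block (inferred from the `4k` in the benchmark names, not stated on this page). The function name `mb_per_s` is only illustrative.

```python
# Assumption: one benchmark operation encrypts a single 4096-byte block,
# as the "4k" in Benchmark4kEnc... suggests.
BLOCK_SIZE = 4096


def mb_per_s(ns_per_op: float, block_size: int = BLOCK_SIZE) -> float:
    """Throughput in MB/s (1 MB = 10**6 bytes) for one block per operation."""
    bytes_per_second = block_size / ns_per_op * 1e9
    return bytes_per_second / 1e6


# Skylake figures from the output above.
print(round(mb_per_s(10688), 2))  # StupidGCM, reported as 383.22 MB/s
print(round(mb_per_s(4073), 2))   # GoGCM, reported as 1005.57 MB/s
```

The recomputed values differ from the reported ones in the last digits, presumably because the printed `ns/op` is itself rounded.
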

**Haswell (Launch: Q2'14)**
```
$ cat /proc/cpuinfo
model name	: Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt dtherm ida arat pln pts
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4	  200000	      6710 ns/op	 610.43 MB/s
Benchmark4kEncGoGCM-4    	  500000	      2422 ns/op	1690.86 MB/s
```

**Ivy Bridge (Launch: Q2'12)**
```
$ grep 'model name\|flags' /proc/cpuinfo | head -n2
model name	: Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
$ gocryptfs -version
gocryptfs v1.7-37-gb1468a7; go-fuse v1.0.0-174-g22a9cb9; 2019-06-11 go1.12 linux/amd64
$ gocryptfs -speed
AES-GCM-256-OpenSSL 	 546.39 MB/s	
AES-GCM-256-Go      	 828.67 MB/s	(selected in auto mode)
AES-SIV-512-Go      	 158.73 MB/s	
```

```
$ cat /proc/cpuinfo 
model name	: Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4	  200000	     14684 ns/op	 278.94 MB/s
Benchmark4kEncGoGCM-4    	  300000	      7792 ns/op	 525.62 MB/s
```

**Sandy Bridge (Launch: Q1'11)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4	  100000	     19070 ns/op	 214.78 MB/s
Benchmark4kEncGoGCM-4    	  200000	     10981 ns/op	 373.01 MB/s
```

**Westmere (Launch: Q1'10)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm epb tpr_shadow vnmi flexpriority ept vpid dtherm ida arat
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-16        100000             18297 ns/op         223.85 MB/s
Benchmark4kEncGoGCM-16            200000              9579 ns/op         427.58 MB/s
```

### 64-bit Intel/AMD (amd64) without AES-NI

**Ivy Bridge (Launch: Q1'13)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Pentium(R) CPU G2130 @ 3.20GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-2	  100000	     22691 ns/op	 180.51 MB/s
Benchmark4kEncGoGCM-2    	   20000	     92810 ns/op	  44.13 MB/s
```

**Sandy Bridge (Launch: Q3'11)**
```
$ grep 'model name\|flags' /proc/cpuinfo | head -n2
model name	: Intel(R) Pentium(R) CPU G630 @ 2.70GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave lahf_lm epb tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm arat pln pts
$ ./gocryptfs -speed
AES-GCM-256-OpenSSL 	 175.80 MB/s	(selected in auto mode)
AES-GCM-256-Go      	  49.53 MB/s	
AES-SIV-512-Go      	  38.37 MB/s
```

**Nehalem (Launch: Q3'09)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Xeon(R) CPU           X3460  @ 2.80GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-8	   50000	     35247 ns/op	 116.21 MB/s
Benchmark4kEncGoGCM-8    	   20000	     92230 ns/op	  44.41 MB/s
```

**Core (Launch: Q1'08)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Core(TM)2 Duo CPU     E7400  @ 2.80GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-2	   30000	     46697 ns/op	  87.71 MB/s
Benchmark4kEncGoGCM-2    	   10000	    194095 ns/op	  21.10 MB/s
```
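
The two amd64 sections differ in whether `aes` appears in the `flags` line of `/proc/cpuinfo`. A minimal Python sketch of that check follows; the helper names `cpu_flags` and `has_aes_ni` are hypothetical, and the sample strings are shortened versions of the flag lists above.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the flags of the first CPU listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_aes_ni(cpuinfo_text: str) -> bool:
    """True if the x86 `aes` flag is present as a whole token."""
    return "aes" in cpu_flags(cpuinfo_text)


# Shortened samples based on the listings on this page.
WITH_AES = "model name\t: Intel(R) Core(TM) i5-4690K\nflags\t\t: fpu sse2 pclmulqdq aes xsave avx\n"
WITHOUT_AES = "model name\t: Intel(R) Pentium(R) CPU G630\nflags\t\t: fpu sse2 pclmulqdq xsave\n"

print(has_aes_ni(WITH_AES), has_aes_ni(WITHOUT_AES))
```

On a Linux machine the same function can be fed the contents of `/proc/cpuinfo` directly, for example `has_aes_ni(open("/proc/cpuinfo").read())`.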

### 64-bit ARM CPUs (arm64)

#### Apple M1 (Launch: Q4'2020)

From https://github.com/rfjakob/gocryptfs/issues/556#issuecomment-848079309

```
% gocryptfs -speed
gocryptfs v2.0-beta4-5-g09870bf; go-fuse v2.1.1-0.20210423170155-a90e1f463c3f => github.com/rfjakob/go-fuse/v2 v2.1.1-0.20210508151621-62c5aa1919a7; 2021-05-25 go1.16.3 darwin/arm64
AES-GCM-256-OpenSSL 	1627.09 MB/s	(selected in auto mode)
AES-GCM-256-Go      	3746.85 MB/s	
AES-SIV-512-Go      	 452.57 MB/s	
XChaCha20-Poly1305-Go	 747.43 MB/s	(benchmark only, not selectable yet)
```

#### Raspberry Pi 4 Model B (BCM2711, 4 x Cortex-A72 @ 1.5 GHz)

From https://github.com/rfjakob/gocryptfs/issues/531#issue-760624096 (Raspberry Pi 4B running Ubuntu 20.10, 64-bit):
```
gocryptfs 1.8.0; go-fuse 2.0.3; 2020-11-27 go1.15.5 linux/arm64
AES-GCM-256-OpenSSL 	  21.44 MB/s	(selected in auto mode)
AES-GCM-256-Go      	  21.06 MB/s	
AES-SIV-512-Go      	  17.70 MB/s	
XChaCha20-Poly1305-Go	 122.86 MB/s	
```

### 32-bit ARM CPUs

#### Odroid XU4 (Exynos 5422 - ARM Cortex-A15 - 2 GHz)
```
model name      : ARMv7 Processor rev 3 (v7l)
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
```

```
$ gocryptfs -speed
AES-GCM-256-OpenSSL       34.26 MB/s    (selected in auto mode)
AES-GCM-256-Go            17.24 MB/s
AES-SIV-512-Go            17.58 MB/s
```

From https://github.com/rfjakob/gocryptfs/issues/452#issuecomment-593334109 :
```
$ ./gocryptfs.xchacha20.armv7 --speed
AES-GCM-256-OpenSSL         N/A
AES-GCM-256-Go            17.04 MB/s    (selected in auto mode)
AES-SIV-512-Go            14.79 MB/s
XChaCha20-Poly1305-Go     23.37 MB/s
```

```
$ openssl speed -evp chacha20-poly1305 && openssl speed -evp aes-256-gcm
...
The 'numbers' are in 1000s of bytes per second processed.
type                 16 bytes    64 bytes     256 bytes    1024 bytes   8192 bytes   16384 bytes
chacha20-poly1305    64066.72k   130153.44k   275532.80k   306572.84k   320018.56k   307903.74k
aes-256-gcm          40323.87k   49980.74k    64734.47k    70323.03k    71862.66k    71786.19k
```
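
`openssl speed` reports throughput in thousands of bytes per second, while `gocryptfs -speed` reports MB/s. The Python sketch below converts one row of the table to MB/s so the two can be read side by side. The function name `parse_openssl_row` is hypothetical, and the parsing assumes the row layout shown above (algorithm name followed by `k`-suffixed numbers).

```python
from typing import List, Tuple


def parse_openssl_row(row: str) -> Tuple[str, List[float]]:
    """Split one `openssl speed` result row into (name, rates in MB/s).

    Each figure is in 1000s of bytes per second, so the value divided by
    1000 gives MB/s (1 MB = 10**6 bytes).
    """
    fields = row.split()
    name = fields[0]
    rates = [float(f.rstrip("k")) * 1000 / 1e6 for f in fields[1:]]
    return name, rates


# Odroid XU4 row from the table above.
ROW = "chacha20-poly1305    64066.72k   130153.44k   275532.80k   306572.84k   320018.56k   307903.74k"

name, rates = parse_openssl_row(ROW)
# Columns are the 16, 64, 256, 1024, 8192 and 16384 byte block sizes.
print(name, [round(r, 2) for r in rates])
```

The figures are not directly comparable to the `gocryptfs -speed` numbers, since the block sizes and the surrounding code differ, but they indicate what each cipher can reach on the same hardware.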

 
#### Raspberry Pi 3 B rev 1.2 (BCM2837 - ARM Cortex-A53 - 1.2 GHz)

(64-bit CPU running in 32-bit mode)

```
model name      : ARMv7 Processor rev 4 (v7l)
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
```

```
$ gocryptfs -speed
AES-GCM-256-OpenSSL       17.13 MB/s    (selected in auto mode)
AES-GCM-256-Go             5.27 MB/s
AES-SIV-512-Go             4.31 MB/s
```

```
$ openssl speed -evp chacha20-poly1305 && openssl speed -evp aes-256-gcm
...
The 'numbers' are in 1000s of bytes per second processed.
type                 16 bytes     64 bytes     256 bytes    1024 bytes   8192 bytes   16384 bytes
chacha20-poly1305    30020.39k    63560.13k    77169.32k    82019.33k    83536.55k    83645.78k
aes-256-gcm          16137.38k    19500.97k    20668.33k    20986.20k    21127.17k    21135.36k
```




#### Raspberry Pi B rev 2 (BCM2835 - ARM11 - 700 MHz)
```
model name      : ARMv6-compatible processor rev 7 (v6l)
Features        : half thumb fastmult vfp edsp java tls
```

```
$ gocryptfs -speed
AES-GCM-256-OpenSSL        4.80 MB/s    (selected in auto mode)
AES-GCM-256-Go             1.85 MB/s
AES-SIV-512-Go             1.50 MB/s
```

```
$ openssl speed -evp chacha20-poly1305 && openssl speed -evp aes-256-gcm
...
The 'numbers' are in 1000s of bytes per second processed.
type                  16 bytes    64 bytes     256 bytes    1024 bytes   8192 bytes   16384 bytes
chacha20-poly1305     8090.97k    18202.65k    23222.03k    24960.34k    25666.44k    24958.29k
aes-256-gcm           4525.91k    6268.65k     6972.36k     7141.38k     7230.33k     7150.88k
```