Originally written by [@lxp](https://github.com/lxp) in [Issue #23](https://github.com/rfjakob/gocryptfs/issues/23)

Note that `Benchmark4kEncStupidGCM` measures OpenSSL and `Benchmark4kEncGoGCM` measures the Go stdlib. In recent gocryptfs versions, you can run `gocryptfs -speed` to run the benchmark and get nicer output.

```
$ go version
go version go1.6 linux/amd64
```
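
The MB/s column in the results below follows from the ns/op column and the 4 KiB block size implied by the `4k` in the benchmark names. A minimal sketch of that arithmetic, assuming 1 MB = 10^6 bytes (the function name `throughputMBps` is made up for illustration and is not part of gocryptfs):

```go
package main

import "fmt"

// throughputMBps converts a benchmark result into MB/s (1 MB = 10^6 bytes).
// blockBytes is the number of bytes processed per operation,
// nsPerOp is the measured time per operation in nanoseconds.
func throughputMBps(blockBytes int, nsPerOp float64) float64 {
	// bytes per nanosecond equals GB/s; multiply by 1000 to get MB/s.
	return float64(blockBytes) / nsPerOp * 1000
}

func main() {
	// Assumption: each operation encrypts one 4096-byte block.
	fmt.Printf("%.2f MB/s\n", throughputMBps(4096, 10688))
	fmt.Printf("%.2f MB/s\n", throughputMBps(4096, 4073))
}
```

The results land within rounding distance of the reported figures, because the printed ns/op value is itself a rounded average.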

### AES-NI
**Skylake (Launch: Q3'15)**
```
$ cat /proc/cpuinfo
model name	: Intel(R) Core(TM) i3-6100U CPU @ 2.30GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm hwp hwp_notify hwp_act_window hwp_epp intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4	  200000	     10688 ns/op	 383.22 MB/s
Benchmark4kEncGoGCM-4    	  300000	      4073 ns/op	1005.57 MB/s
```

**Haswell (Launch: Q2'14)**
```
$ cat /proc/cpuinfo
model name	: Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt dtherm ida arat pln pts
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4	  200000	      6710 ns/op	 610.43 MB/s
Benchmark4kEncGoGCM-4    	  500000	      2422 ns/op	1690.86 MB/s
```

**Ivy Bridge (Launch: Q2'12)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4	  200000	     14684 ns/op	 278.94 MB/s
Benchmark4kEncGoGCM-4    	  300000	      7792 ns/op	 525.62 MB/s
```

**Sandy Bridge (Launch: Q1'11)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4	  100000	     19070 ns/op	 214.78 MB/s
Benchmark4kEncGoGCM-4    	  200000	     10981 ns/op	 373.01 MB/s
```

**Westmere (Launch: Q1'10)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm epb tpr_shadow vnmi flexpriority ept vpid dtherm ida arat
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-16        100000             18297 ns/op         223.85 MB/s
Benchmark4kEncGoGCM-16            200000              9579 ns/op         427.58 MB/s
```

### no AES-NI

**Ivy Bridge (Launch: Q1'13)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Pentium(R) CPU G2130 @ 3.20GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-2	  100000	     22691 ns/op	 180.51 MB/s
Benchmark4kEncGoGCM-2    	   20000	     92810 ns/op	  44.13 MB/s
```

**Sandy Bridge (Launch: Q3'11)**
```
$ grep 'model name\|flags' /proc/cpuinfo | head -n2
model name	: Intel(R) Pentium(R) CPU G630 @ 2.70GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave lahf_lm epb tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm arat pln pts
$ ./gocryptfs -speed
AES-GCM-256-OpenSSL 	 175.80 MB/s	(selected in auto mode)
AES-GCM-256-Go      	  49.53 MB/s	
AES-SIV-512-Go      	  38.37 MB/s
```

**Nehalem (Launch: Q3'09)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Xeon(R) CPU           X3460  @ 2.80GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-8	   50000	     35247 ns/op	 116.21 MB/s
Benchmark4kEncGoGCM-8    	   20000	     92230 ns/op	  44.41 MB/s
```

**Core (Launch: Q1'08)**
```
$ cat /proc/cpuinfo 
model name	: Intel(R) Core(TM)2 Duo CPU     E7400  @ 2.80GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-2	   30000	     46697 ns/op	  87.71 MB/s
Benchmark4kEncGoGCM-2    	   10000	    194095 ns/op	  21.10 MB/s
```
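
For a rough comparison on hardware not listed above, the Go stdlib side of the benchmark can be approximated with a few lines of Go. This is a sketch under stated assumptions, not the actual gocryptfs benchmark code: it assumes AES-256-GCM over 4096-byte blocks, and the function name `encrypt4kBlocks` is invented for illustration. Running `gocryptfs -speed` remains the simpler way to get comparable numbers.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"time"
)

// encrypt4kBlocks seals n 4096-byte blocks with AES-256-GCM from the Go
// stdlib and returns the measured throughput in MB/s (1 MB = 10^6 bytes).
func encrypt4kBlocks(n int) (float64, error) {
	key := make([]byte, 32) // 32-byte key selects AES-256
	if _, err := rand.Read(key); err != nil {
		return 0, err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return 0, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return 0, err
	}

	// A fixed nonce is acceptable only because the output is thrown away.
	// Never reuse a nonce with the same key for real data.
	nonce := make([]byte, gcm.NonceSize())
	plain := make([]byte, 4096)
	out := make([]byte, 0, len(plain)+gcm.Overhead())

	start := time.Now()
	for i := 0; i < n; i++ {
		// Reuse the output buffer to keep allocations out of the timing.
		out = gcm.Seal(out[:0], nonce, plain, nil)
	}
	elapsed := time.Since(start)

	return float64(n*len(plain)) / elapsed.Seconds() / 1e6, nil
}

func main() {
	mbps, err := encrypt4kBlocks(100000)
	if err != nil {
		panic(err)
	}
	fmt.Printf("AES-256-GCM (Go stdlib): %.2f MB/s\n", mbps)
}
```

A single timed loop like this is noisier than the `testing` package's benchmark harness, so the output is best read as an order-of-magnitude check.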