Throughput & scalability
Everyone claims to be scalable. Here we use simple napkin math to show that Headjack can handle billions of accounts and anchor unbounded amounts of off-chain content tied to identity.
How big is a Headjack transaction
Applications post anchors to off-chain content with an IPFS CID hash and a merkle root. IDMs also anchor off-chain content (mainly user preferences & updates to social graph), but they also post authorizations to other accounts (applications) to post on behalf of users as integer pairs.
So the fields for a transaction by an application/IDM (which will be the majority) are:
- version: 4 bytes
- signature: 65 bytes
- blob IPFS address: 32 bytes
- blob merkle root: 32 bytes
- nonce: 4 bytes - auto-increment integer associated with the account, used to prevent reordering of anchored off-chain blobs (which would mess up internal addressing based on that nonce)
- value: 4 bytes - amount of native token paid to validators for transaction inclusion
So far that is 141 bytes, which almost every transaction by an application or IDM contains. IDMs also submit a list of authorizations (or revocations) as integer pairs. For example, 1000 accounts each authorizing one of 15 different applications to post on their behalf would be 1000 integer pairs. Assuming 8-byte integers (up to 2^64), that would be 8 * 2 * 1000 = 16k bytes.
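The size arithmetic above can be sketched in a few lines of Python. The field names and helper functions are illustrative only (they are not part of any Headjack API); the byte sizes are the ones listed above.

```python
# Field sizes in bytes, copied from the list above. Names are illustrative.
TX_FIELDS = {
    "version": 4,
    "signature": 65,
    "blob_ipfs_address": 32,
    "blob_merkle_root": 32,
    "nonce": 4,
    "value": 4,
}

INDEX_BYTES = 8               # assumed integer width (up to 2^64)
PAIR_BYTES = 2 * INDEX_BYTES  # one authorization = (account index, application index)


def base_tx_size() -> int:
    """Bytes carried by almost every application/IDM transaction."""
    return sum(TX_FIELDS.values())


def idm_tx_size(num_pairs: int) -> int:
    """Bytes of an IDM transaction that also carries authorization pairs."""
    return base_tx_size() + num_pairs * PAIR_BYTES


print(base_tx_size())        # 141
print(1000 * PAIR_BYTES)     # 16000
print(idm_tx_size(1000))     # 16141
```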
Naive scenario
The initial version will target block bandwidth of up to 100 kb/s. This is not a problem for ZK validiums as there are already DA solutions that offer 10 mb/s or even much more.
Assuming:
- 1 MB block size & 10 second block time (100 kb/s of block bandwidth)
- 1000 applications posting in every block
- 100 IDMs authorizing as many users as possible - filling the remaining block space
- no on-chain actions such as keypair & name changes, account creation & direct interaction with the chain by end users
We get:
- 1100 actors (1000 applications + 100 IDMs) that post in every block at least 141 bytes for their transactions, which is 155100 bytes
- the remaining 893476 bytes (1048576 (1 MB) - 155100) can be filled with authorizations, and since an authorization is 16 bytes (8 * 2) that would be 55842 authorizations/revocations every 10 seconds or 5584 authorizations/revocations per second
- for 1 billion accounts that would be 0.482 authorizations/revocations per person per day (5584 * 86400 / 1 billion), which is actually quite good - people on average do way fewer single sign-ons per day
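The napkin math for the naive scenario can be reproduced as follows. This is a minimal sketch using only the assumptions listed above; the variable names are made up for illustration, and the per-second figure is rounded down before the daily total is computed, as in the text.

```python
BLOCK_BYTES = 1024 * 1024      # 1 MB block
BLOCK_TIME_S = 10              # 10 second block time
ACTORS = 1000 + 100            # applications + IDMs posting in every block
BASE_TX_BYTES = 141            # per-transaction overhead from the field list
AUTH_BYTES = 8 * 2             # one authorization = two 8-byte integers
ACCOUNTS = 1_000_000_000
SECONDS_PER_DAY = 86_400

base_bytes = ACTORS * BASE_TX_BYTES                  # 155100
remaining_bytes = BLOCK_BYTES - base_bytes           # 893476
auths_per_block = remaining_bytes // AUTH_BYTES      # 55842
auths_per_second = auths_per_block // BLOCK_TIME_S   # 5584
auths_per_day = auths_per_second * SECONDS_PER_DAY   # 482457600
auths_per_person_per_day = auths_per_day / ACCOUNTS  # ~0.482

print(base_bytes, remaining_bytes, auths_per_block, auths_per_second)
print(auths_per_day, round(auths_per_person_per_day, 3))
```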
Completely different goals - comparing the 2 protocols just to put things into perspective:

| | Headjack | Ethereum |
|---|---|---|
| block size | 1 MB | ~80 kb |
| block time | 10 seconds | ~13 seconds |
| blockchain bandwidth per second | 100 kb/s (x16 more than Ethereum) | ~6.15 kb/s |
| blockchain bandwidth per day | 8640 mb/d | ~528 mb/d |
| transactions/authorizations per second | 5584 APS | ~14 TPS |
| transactions/authorizations per day | 482,457,600 | 1,209,600 |
| transactions/authorizations per person per day for 1 billion accounts | 0.482 (x400 more than Ethereum) | 0.0012096 |
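The ratios quoted in the comparison can be checked with a short script. The Ethereum inputs are the table's own approximations, and 1 MB is treated as 1000 kb here to match the 100 kb/s figure; the dictionary keys and function names are illustrative.

```python
# Inputs taken from the comparison table; Ethereum values are approximate.
HEADJACK = {"block_kb": 1000, "block_time_s": 10, "ops_per_s": 5584}
ETHEREUM = {"block_kb": 80, "block_time_s": 13, "ops_per_s": 14}

SECONDS_PER_DAY = 86_400
ACCOUNTS = 1_000_000_000


def bandwidth_kb_per_s(chain: dict) -> float:
    """Block bandwidth in kb per second."""
    return chain["block_kb"] / chain["block_time_s"]


def ops_per_person_per_day(chain: dict) -> float:
    """Transactions/authorizations per account per day for 1 billion accounts."""
    return chain["ops_per_s"] * SECONDS_PER_DAY / ACCOUNTS


bandwidth_ratio = bandwidth_kb_per_s(HEADJACK) / bandwidth_kb_per_s(ETHEREUM)
ops_ratio = ops_per_person_per_day(HEADJACK) / ops_per_person_per_day(ETHEREUM)

print(f"{bandwidth_ratio:.2f}")  # 16.25, the "x16" in the table
print(f"{ops_ratio:.1f}")        # 398.9, the "x400" in the table
```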
Realistic scenario
The naive scenario does not include on-chain actions for specific accounts such as:
- keypair changes (new pubkey (32 bytes) + signature (65 bytes) if there is an older key)
- account creation (if done by an IDM then this is just a few bytes - no pubkey)
- name registration & ownership changes (see the dedicated page for more details)
- updating account fields such as a URI pointing towards an off-chain account directory (which could point to archived posts) or pointing to another account index for such services
- signed transactions by individual accounts that want to directly interact with the chain
    - authorizing an IDM, rotating keys, or even publishing off-chain content as an application
However, the realistic scenario will not be far from the naive one because:
- Only a fraction of all accounts will have keypairs (even though 100% could) and will make just a few signed actions per year - leaving most block throughput for authorizations through IDMs.
- A large share of accounts will rarely even be authorizing new applications - many people don't sign in to new services through SSO every single day. There could also be 2 types of log-ins: passive (viewing only - nothing on-chain) and authorized (allowing services to post on behalf of users).
- Many applications that don't generate a lot of off-chain activity will publish less often than on every block in order to minimize on-chain block space costs.
- The chain throughput can be further optimized & scaled by multiple orders of magnitude.
Optimizations & scaling
- Throughput of 100 kb/s is just the start & can easily go to 1-10 mb/s as a ZK rollup.
- The chain & state can be trivially sharded - there aren't problems such as fracturing liquidity or preventing composability because accounts don't care about each other - they mostly contain authorization block numbers & keypair history.
- Integer indexes that only need 4 bytes can be compressed/batched together - it'll take many years to go beyond 4 billion accounts so the actual throughput is 2x of what is listed here.
- A fee market can develop that tunes the cost of different actions so that actors don't just pay for on-chain bytes - the ways the system is used can be guided through incentives.
- Other optimizations not listed here - this is just the starting point.
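To illustrate the 4-byte index point from the list above: packing an authorization pair with 4-byte instead of 8-byte integers halves its size, so the same block space fits twice as many authorizations. The little-endian fixed-width encoding below is an assumption made for this sketch, not a documented Headjack wire format.

```python
import struct

REMAINING_BLOCK_BYTES = 893_476  # block space left for authorizations in the naive scenario


def pack_pair_wide(account_index: int, app_index: int) -> bytes:
    """Two 8-byte unsigned integers (the size assumed in the naive scenario)."""
    return struct.pack("<QQ", account_index, app_index)


def pack_pair_compact(account_index: int, app_index: int) -> bytes:
    """Two 4-byte unsigned integers; enough while indexes stay below 2^32."""
    return struct.pack("<II", account_index, app_index)


wide = pack_pair_wide(123_456_789, 42)
compact = pack_pair_compact(123_456_789, 42)

print(len(wide), len(compact))                # 16 8
print(REMAINING_BLOCK_BYTES // len(wide))     # 55842 authorizations per block
print(REMAINING_BLOCK_BYTES // len(compact))  # 111684 authorizations per block
```

Note that `struct.pack("<II", ...)` raises `struct.error` for any index above 2^32 - 1, which is the point at which a wider encoding would be needed.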
State growth
Headjack's main value proposition is keeping historical records of the sequence of authorizations, key changes & off-chain content anchors and being able to generate proofs for any specific piece of off-chain content.
TODO: finish this
https://ethereum.stackexchange.com/questions/268/ethereum-block-architecture
numbers - state - one difference from other cryptos is that this one is append-only and could be designed to be easier on memory access patterns
One difference with other blockchains is that accounts in Headjack are numbers and thus the state tree could be different.
on eth state growth: https://twitter.com/SalomonCrypto/status/1587983584471633921 https://hackmd.io/@vbuterin/state_size_management
All on-chain changes just append data to one of the few attributes of:
- accounts:
    - public keys: a map of keys and block height integer ranges (non-overlapping)
    - authorizations: a map of indexes and arrays of block height integer ranges
    - nonces: an array that maps autoincrement indexes to block numbers
        - appended only when publishing off-chain content (usually an application/IDM)
- names:
    - owners: a map of owner indexes and block height integer ranges (non-overlapping)
    - nonces: an array that maps autoincrement indexes to account index & nonce pairs
        - appended only when publishing off-chain content (usually an application/IDM)
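One possible way to model the append-only account attributes is sketched below. This is a hypothetical in-memory layout, not Headjack's actual state format: block height ranges are stored as a flat list of heights that alternate between "authorized at" and "revoked at", so every change is a pure append. The class and method names are invented for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class AccountState:
    # application index -> block heights alternating: authorized, revoked, authorized, ...
    authorizations: dict[int, list[int]] = field(default_factory=dict)
    # position in the list is the nonce; the value is the block number of that anchor
    nonces: list[int] = field(default_factory=list)

    @staticmethod
    def _append(heights: list[int], block: int) -> None:
        if heights and block < heights[-1]:
            raise ValueError("block heights must not decrease")
        heights.append(block)

    def authorize(self, app_index: int, block: int) -> None:
        heights = self.authorizations.setdefault(app_index, [])
        if len(heights) % 2 == 1:
            raise ValueError("application is already authorized")
        self._append(heights, block)

    def revoke(self, app_index: int, block: int) -> None:
        heights = self.authorizations.get(app_index, [])
        if len(heights) % 2 == 0:
            raise ValueError("application is not currently authorized")
        self._append(heights, block)

    def is_authorized(self, app_index: int, block: int) -> bool:
        """True if app_index was authorized at the given block height."""
        heights = self.authorizations.get(app_index, [])
        # An odd number of events at or before `block` means an open range.
        return bisect_right(heights, block) % 2 == 1

    def record_anchor(self, block: int) -> int:
        """Append an anchor and return the nonce assigned to it."""
        self._append(self.nonces, block)
        return len(self.nonces) - 1


acct = AccountState()
acct.authorize(7, 100)
acct.revoke(7, 250)
acct.authorize(7, 300)
print(acct.is_authorized(7, 200), acct.is_authorized(7, 260), acct.is_authorized(7, 999))
```

Because nothing is ever rewritten, answering "was application 7 authorized at block 200?" is a binary search over a short sorted list.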
TODO: should IPFS hashes & merkle roots be saved in the state?
- no?
TODO: light clients? in addition to merkle proofs for inclusion of content they would need merkle proofs for the state of which applications a user has authorized to post on their behalf in a given block
state growth: https://twitter.com/keoneHD/status/1574451986501623808
Off-chain content
There are no limits for off-chain content as it is all just anchored with merkle roots - it could be as high as hundreds of terabytes per second. There isn't a more minimal design that can link unbounded amounts of off-chain data to billions of identities that can change keys & names and yet still provide the guarantees & mental model simplicity of Headjack - it achieves consensus on the absolute bare minimum.
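To make the anchoring idea concrete, here is a minimal Merkle root sketch: however many content items go into a blob, the on-chain commitment stays a single 32-byte hash. The choice of SHA-256, the plain concatenation of child hashes, and the duplication of the last node on odd-sized levels are all assumptions made for illustration; the actual tree construction used by Headjack is not specified in this section.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest (hash function chosen for illustration)."""
    return hashlib.sha256(data).digest()


def merkle_root(items: list[bytes]) -> bytes:
    """Fold a list of content items into one 32-byte root."""
    if not items:
        raise ValueError("need at least one item")
    level = [h(item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


small = merkle_root([b"post-1", b"post-2", b"post-3"])
large = merkle_root([f"post-{i}".encode() for i in range(100_000)])
print(len(small), len(large))  # 32 32
```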