
MaxBytes #2184

Merged: 7 commits merged into develop from 2035-max-bytes on Aug 31, 2018
Conversation

@melekes (Contributor) commented Aug 8, 2018

Refs #2035

  • Updated all relevant documentation in docs
  • Updated all code comments where relevant
  • Wrote tests
  • Updated CHANGELOG_PENDING.md

@melekes melekes requested review from ebuchman and xla as code owners August 8, 2018 12:28
_, err = cdc.UnmarshalBinaryReader(cs.ProposalBlockParts.GetReader(), &cs.ProposalBlock, int64(cs.state.ConsensusParams.BlockSize.MaxBytes))
maxDataSize := cs.state.ConsensusParams.BlockSize.MaxTxsBytes
// header + evidence + last commit
const maxOtherPartsSize = 5000000 // 5 MB
Contributor Author

This is ugly, and it should be lower.

Contributor

Have we run through the numbers for how we got this size?
The header will be 100 ed25519 sigs, plus a bunch of other stuff. We should have the exact size of the last commit. Not sure how getting the evidence size should work.

Contributor

We should also calculate the exact amino overhead. (Perhaps we make a method in amino to reflect through all the structs and tell you how big it would be if you were to marshal it.)

Contributor

We should be able to be more precise, and we should do the opposite (i.e. subtract from maxBlockSize to get maxDataSize).

@@ -368,7 +383,8 @@ func (mem *Mempool) notifyTxsAvailable() {

// Reap returns a list of transactions currently in the mempool.
// If maxTxs is -1, there is no cap on the number of returned transactions.
func (mem *Mempool) Reap(maxTxs int) types.Txs {
// If maxTxsBytes is -1, there is no cap on the size of returned transactions.
func (mem *Mempool) Reap(maxTxs, maxTxsBytes int) types.Txs {
Contributor Author

Having maxTxs is convenient for testing and for the /unconfirmed_txs?limit=N endpoint, but it could still be removed. Then, though, I am not sure what to do with the /unconfirmed_txs?limit=N endpoint. Should we limit by total bytes?

Contributor

I think we could just make a separate ReapTxs function, which should only be used by the endpoint and testing. I don't think that sort of code repetition will hurt us, as long as we don't use it in consensus.

Contributor Author

Nice idea!

@@ -328,7 +328,7 @@ func TestConsensusParamsChangesSaveLoad(t *testing.T) {
params[0] = state.ConsensusParams
for i := 1; i < N+1; i++ {
params[i] = *types.DefaultConsensusParams()
params[i].BlockSize.MaxBytes += i
params[i].BlockSize.MaxTxsBytes += i
Contributor Author

not sure what's going on here

@@ -206,7 +206,7 @@ message ConsensusParams {

// BlockSize contains limits on the block size.
message BlockSize {
int32 max_bytes = 1;
int32 max_txs_bytes = 1;
int32 max_txs = 2;
Contributor

I think we can remove max_txs. The only utility of a maximum tx count is now for the RPC endpoint.

@@ -393,33 +393,34 @@ func (mem *Mempool) Reap(maxTxs, maxTxsBytes int) types.Txs {
time.Sleep(time.Millisecond * 10)
}

txs := mem.collectTxs(maxTxs, maxTxsBytes)
var cur int
txs := make([]types.Tx, 0, mem.txs.Len())
Contributor (@ValarDragon, Aug 8, 2018)

I think we will get a performance boost if we have a good estimate of the average size per tx, and set the initial capacity based off of that. (We could under-estimate the average size per tx a bit.) In the case of max=-1, we could just call reapTxs(-1) for simplicity.

i.e. Min(mem.txs.Len(), max / avgTxSize)
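A minimal sketch of that heuristic, assuming a hypothetical avgTxSize running estimate (no such field exists in the mempool; whether the optimization is worth doing at all is debated just below):

```
// Hypothetical sketch only: avgTxSize would be a running estimate of
// bytes per tx maintained by the mempool, which is not existing state.
func reapCapacity(numTxs, maxTxsBytes, avgTxSize int) int {
	if maxTxsBytes < 0 || avgTxSize <= 0 {
		return numTxs // unbounded reap: preallocate room for every tx
	}
	if est := maxTxsBytes / avgTxSize; est < numTxs {
		return est // under-estimating only costs an extra slice grow later
	}
	return numTxs
}

// usage: txs := make([]types.Tx, 0, reapCapacity(mem.txs.Len(), maxTxsBytes, avgTxSize))
```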

Contributor Author

Indeed, but how big of a boost? It's better to make such optimizations once we know the current performance profile. It may be that the time to calculate the average size hurts us more (the cost outweighs the gains).

Contributor

Great point!

}

txs := make([]types.Tx, 0, cmn.MinInt(mem.txs.Len(), maxTxs))
for e := mem.txs.Front(); e != nil && len(txs) < maxTxs; e = e.Next() {
txs := make([]types.Tx, 0, mem.txs.Len())
Contributor

Can we still take the min of mem.txs.Len() and max (and keep the maxTxs == -1 check)?

@@ -1328,6 +1336,13 @@ func (cs *ConsensusState) finalizeCommit(height int64) {

fail.Fail() // XXX

// update mempool filter because consensus params might have changed
cs.mempool.SetFilter(func(tx types.Tx) bool {
Contributor

Is the idea behind using a filter here that we can more easily use any consensus param, rather than just the max bytes? The only other thing I can imagine us doing pre-launch is enforcing a min fee. I think both of these would be better suited inside the mempool itself. That would also avoid having to take the mutex on the consensus params every time a new tx is added. I think this will cause blocking issues, plus the defer is a performance hit.

Contributor Author

> better suited inside the mempool itself

But consensus params can be changed in any block (https://tendermint.com/docs/app-dev/abci-spec.html#endblock). You mean we should just copy them over if they change? That's actually not a bad idea.

@ValarDragon (Contributor)

Do you want to enforce max gas in this PR as well, or should we save that for a separate PR?

@melekes (Contributor Author) commented Aug 9, 2018

> Do you want to enforce max gas in this PR as well, or should we save that for a separate PR?

Separate PR.

@codecov-io commented Aug 9, 2018
Codecov Report

Merging #2184 into develop will decrease coverage by 0.28%.
The diff coverage is 61.03%.

@@             Coverage Diff             @@
##           develop    #2184      +/-   ##
===========================================
- Coverage    61.14%   60.86%   -0.29%     
===========================================
  Files          196      196              
  Lines        16131    16201      +70     
===========================================
- Hits          9864     9860       -4     
- Misses        5411     5479      +68     
- Partials       856      862       +6
Impacted Files Coverage Δ
rpc/core/mempool.go 0% <0%> (ø) ⬆️
rpc/core/pipe.go 31.7% <0%> (ø) ⬆️
consensus/state.go 76.89% <100%> (-0.55%) ⬇️
evidence/pool.go 56.14% <100%> (ø) ⬆️
state/services.go 38.46% <38.46%> (ø) ⬆️
node/node.go 64.19% <50%> (-0.08%) ⬇️
mempool/mempool.go 69.35% <51.72%> (-2.27%) ⬇️
state/execution.go 75.25% <60%> (-0.26%) ⬇️
evidence/store.go 90.36% <75%> (-2.05%) ⬇️
libs/autofile/autofile.go 69.23% <0%> (-6.16%) ⬇️
... and 4 more

@ebuchman (Contributor) left a comment

Thanks for getting this started, Anton. I did a cursory review. We probably should have started with an ADR here, though. Would you mind writing one before we continue working on the code?

int32 max_bytes = 1;
int32 max_txs = 2;
int64 max_gas = 3;
int32 max_txs_bytes = 1;
Contributor

I think we should try to keep these as the actual block size - it includes commits and evidence and the header. We should have upper bounds for each of those that we can use to determine the max_tx_bytes.

@@ -153,6 +153,13 @@ func NewConsensusState(
cs.setProposal = cs.defaultSetProposal

cs.updateToState(state)

// filter mempool txs based on consensus params
maxBytes := cs.state.ConsensusParams.TxSize.MaxBytes
Contributor

I feel like it would be much better if this kind of thing were set higher up, rather than being a weird side effect of the consensus state. Can this be done from node.go?

I see we might need this because ConsensusParams might change. Maybe we can pass it in through Update instead? Better to concentrate the "changes due to a block commit" in one place.

Contributor Author

> better to concentrate the "changes due to a block commit" in one place

Agree.

You mean

if err := blockExec.mempool.Update(block.Height, block.Txs); err != nil {

, right?

Contributor

Yes - if there are more parameters the mempool will depend on beyond the height, we should pass those in there too.
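A rough sketch of that shape, assuming the filter travels with Update at commit time (the signature matches the one quoted later in this thread; the body and the filter field are illustrative):

```
// Sketch only: consensus-param-derived values are handed to the mempool
// at commit time instead of being read from consensus state.
func (mem *Mempool) Update(height int64, txs types.Txs, filter func(types.Tx) bool) error {
	if filter != nil {
		mem.filter = filter // re-check future txs against the new params
	}
	// ... remove committed txs, then re-validate what remains ...
	return nil
}
```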


@melekes (Contributor Author) commented Aug 13, 2018

> We probably should have started with an ADR here though

Agree. I wanted to throw some code in for discussion, but it took longer than expected. Will write an ADR.

@melekes melekes requested a review from zramsay as a code owner August 13, 2018 11:59

- MaxHeaderBytes - 512 bytes (~200 bytes hashes + 200 bytes - 50 UTF-8 encoded symbols of chain ID)
- MaxEvidenceBytes - 4096 bytes (10 DuplicateVoteEvidence ~350 bytes each)
- MaxLastCommitBytes - 163840 (160 KB) (1000 votes ~150 bytes each)
Contributor

Not sure this one should be a parameter. It seems completely determined by the number of validators and the Tendermint protocol, right?

Contributor Author

Correct, but we still want to limit the block size. We know that Tendermint currently does not support more than 300 validators (fewer, in practice), so 1000 should be fine.

Contributor (@ValarDragon, Aug 13, 2018)

Why not go the other way around?

i.e. set a MaxBlockSize, and then derive the tx size as follows:

MaxLastCommitBytes = function of the number of validators currently enabled
MaxEvidenceBytes = function of a consensus param: "maxNumEvidences"
MaxHeaderBytes = a consensus param

Then MaxTxSize = MaxBlockSize - amino overhead - MaxLastCommitBytes - MaxEvidenceBytes - MaxHeaderBytes. This makes MaxBlockSize the constant in consensus, not the commit / tx size. (MaxTxSize would then decrease automatically as the number of validators increases.)
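A hedged sketch of that subtraction, using the illustrative per-part bounds floated in this discussion (none of these constants are settled):

```
// Sketch of the proposed derivation; the constants mirror the defaults
// proposed in this thread (512-byte header, ~150 bytes per vote,
// ~350 bytes per DuplicateVoteEvidence, ~8 bytes amino overhead).
func maxDataBytes(maxBlockBytes, numValidators, maxNumEvidences int) int {
	const (
		maxAminoOverhead = 8
		maxHeaderBytes   = 512
		maxVoteBytes     = 150
		maxEvidenceBytes = 350
	)
	return maxBlockBytes -
		maxAminoOverhead -
		maxHeaderBytes -
		numValidators*maxVoteBytes -
		maxNumEvidences*maxEvidenceBytes
}
```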

Contributor Author

> function of a consensus param: "maxNumEvidences"

and MaxEvidenceBytes - the max size of a single piece of evidence

> function of the number of validators currently enabled

and MaxVoteBytes, right?

> This makes MaxBlockSize the constant in consensus, not the commit / tx size.

Right, so it's either a) we have a sum, and the block size grows as the number of validators increases, OR b) we have a fixed BlockSize.MaxBytes, and the number of txs decreases as the number of validators increases.

Contributor (@ValarDragon, Aug 14, 2018)

I prefer the latter: a fixed block size, with the space for txs decreasing as the validator count increases. Fixing the block size rather than the tx data size feels more natural to me, and it allows for better predictions of storage requirements, hence my preference.

MaxEvidenceBytes and MaxVoteBytes would be calculated values, not part of the config, right?


Proposed default values are:

- MaxHeaderBytes - 512 bytes (~200 bytes hashes + 200 bytes - 50 UTF-8 encoded symbols of chain ID)
Contributor

This is probably my lack of familiarity, but why are there 200 bytes of hashes? I thought it would just be one merkleized hash at the root of the header.

Contributor Author

Nah, the Tendermint header will be identical to the ABCI header.

Contributor

You're right, thanks for the clarification! (I hadn't read through the ABCI header definition before; this makes sense now.)

Proposed default values are:

- MaxHeaderBytes - 512 bytes (~200 bytes hashes + 200 bytes - 50 UTF-8 encoded symbols of chain ID)
- MaxEvidenceBytes - 4096 bytes (10 DuplicateVoteEvidence ~350 bytes each)
Contributor

Why don't we make this maximum "MaxEvidencesPerBlock", and then derive the associated bytes directly from that?

Contributor Author

Good idea

@melekes (Contributor Author) commented Aug 15, 2018

Updated ADR.


1) Get rid of MaxTxs.
2) Rename MaxTxsBytes to MaxBytes.
3) Add MaxHeaderBytes cs parameter (TODO: /~https://github.com/tendermint/tendermint/blob/8a1a79257e8e8b309cd35bb1fe40bf9b3330fd7d/types/block.go#L195)
Contributor
I'm confused as to why this is a consensus parameter.

AFAICT, everything here can be upper bounded from the type definition (tendermint/types/block.go, lines 197 to 222 at 8a1a792):

type Header struct {
// basic block info
ChainID string `json:"chain_id"`
Height int64 `json:"height"`
Time time.Time `json:"time"`
NumTxs int64 `json:"num_txs"`
TotalTxs int64 `json:"total_txs"`
// prev block info
LastBlockID BlockID `json:"last_block_id"`
// hashes of block data
LastCommitHash cmn.HexBytes `json:"last_commit_hash"` // commit from validators from the last block
DataHash cmn.HexBytes `json:"data_hash"` // transactions
// hashes from the app output from the prev block
ValidatorsHash cmn.HexBytes `json:"validators_hash"` // validators for the current block
NextValidatorsHash cmn.HexBytes `json:"next_validators_hash"` // validators for the next block
ConsensusHash cmn.HexBytes `json:"consensus_hash"` // consensus params for current block
AppHash cmn.HexBytes `json:"app_hash"` // state after txs from the previous block
LastResultsHash cmn.HexBytes `json:"last_results_hash"` // root hash of all results from the txs from the previous block
// consensus info
EvidenceHash cmn.HexBytes `json:"evidence_hash"` // evidence included in the block
ProposerAddress Address `json:"proposer_address"` // original proposer of the block
}

All of the hashes are tmhashes, time.Time surely has a max size, and we can bound the chain ID (256 bits base64-encoded? This depends more on what our plan for chain IDs is). We can use the max size for "int" in PartSetHeader (used within BlockID). So we should be able to derive MaxHeaderBytes just from its definition. We can amino-encode it as well, so we don't have to guess at the amino overhead for the header.

Contributor Author

Right. Thanks.

@ValarDragon (Contributor)

ADR looks great to me! Thanks for writing it!

@@ -21,7 +21,7 @@ func TestConsensusParams(t *testing.T) {

// get values with real fields
assert.NotNil(params.GetBlockSize())
Contributor Author

not sure I understand the point of this test

@@ -194,9 +194,9 @@ func newMockEvidencePool(val []byte) *mockEvidencePool {
}
}

func (m *mockEvidencePool) PendingEvidence() []types.Evidence {
func (m *mockEvidencePool) PendingEvidence(limit int) []types.Evidence {
Contributor Author

This should really be a uint, but I'm not sure it's the right time.

evidence/pool.go Outdated
@@ -58,8 +58,8 @@ func (evpool *EvidencePool) PriorityEvidence() []types.Evidence {
}

// PendingEvidence returns all uncommitted evidence.
Contributor Author

update description

@@ -87,8 +88,9 @@ func (store *EvidenceStore) PriorityEvidence() (evidence []types.Evidence) {
}

// PendingEvidence returns all known uncommitted evidence.
Contributor Author

update description

@@ -57,6 +57,8 @@ var (
ErrMempoolIsFull = errors.New("Mempool is full")
)

// type Filter func(types.Tx) bool
Contributor Author

remove

types/block.go Outdated
MaxHeaderBytes = 476

// AminoOverheadForBlock - amino overhead to encode the block (whole block + data field).
AminoOverheadForBlock = 5
Contributor Author

do we need a script to get these values?

Contributor

At the very least a test should fail if the Header changes and MaxHeaderBytes doesn't change with it.

@melekes (Contributor Author) commented Aug 17, 2018

TODO: limit the chain ID to 50 symbols max (since we now have MaxHeaderBytes, calculated assuming at most 50 UTF-8 symbols).
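A small sketch of that check. Counting runes rather than bytes is an assumption here, consistent with the maximal-chain-ID test later in this thread, which builds the chain ID out of 4-byte UTF-8 symbols; MaxChainIDLen = 50 matches the 50-symbol assumption baked into MaxHeaderBytes:

```
// Sketch of the chain-ID bound implied by the TODO (assumes fmt is imported).
const MaxChainIDLen = 50

func validateChainID(chainID string) error {
	if n := len([]rune(chainID)); n > MaxChainIDLen {
		return fmt.Errorf("chain ID is %d symbols, max is %d", n, MaxChainIDLen)
	}
	return nil
}
```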

@ebuchman (Contributor) left a comment

Just a few comments, not a thorough review

int32 max_txs = 2;
int64 max_gas = 3;
int64 max_gas = 2;
int32 max_num_evidences = 3;
Contributor

Do we need to expose this here? Maybe we can set some internal heuristic and let it all just be bounded by max_bytes?

Contributor

What do you mean by this? Don't we want this to be a consensus parameter, and since it pertains to part of the block size, doesn't it belong in this object?

If I understand correctly, you're suggesting we have a function inside the code that calculates max_num_evidence = f(max_bytes). This seems harder to configure, and as a user of the protocol I would want an easy way to alter this function. (At which point it would have to go back into the config.)

Contributor

> This seems harder to configure, and as a user of the protocol I would want an easy way to alter this function.

Do you really, though? It seems like it should basically be as large as possible - if there's evidence, Tendermint should prioritize including it. It's not clear that this really needs to be exposed to users right now.



```
MaxLastCommitBytes = {number of validators currently enabled} * {MaxVoteBytes}
MaxEvidenceBytes = {MaxNumEvidences} * {MaxEvidenceBytes}
```
Contributor

I feel like we can set an internal MaxEvidenceBytes, check whether evidence is available, take up to that amount, and then subtract precisely the amount of evidence we're including, rather than potentially leaving a lot of room with every block (since most blocks will have no evidence).

Contributor (@ValarDragon, Aug 20, 2018)

In order to do that, we would have to acquire the evidence before reaping the txs, or do two tx reaps (the second after the evidence is included). I'm not opposed to either of those possibilities.

Setting MaxEvidenceBytes vs. MaxNumEvidences is independent of that, however. I do think MaxNumEvidences is the better variable for the config.

Contributor Author

> rather than potentially leaving a lot of room here with every block

Yeah, I did not think of that. Thank you! Let's do that.

> In order to do that, we would have to acquire the evidence before reaping the txs, or do 2 tx reaps.

We can do it in one pass:

txs := cs.mempool.ReapMaxBytes(maxDataBytes(cs.state.Validators.Size(), &cs.state.ConsensusParams.BlockSize))
evidence := cs.evpool.PendingEvidence(cs.state.ConsensusParams.BlockSize.MaxNumEvidences)


### Negative

* Necessity to configure additional variables
Contributor (@ebuchman, Aug 17, 2018)

This will actually reduce the need to configure more variables, potentially!


@ebuchman (Contributor)

We should probably add something about gas to the ADR too, so that we limit by both bytes and gas (whichever limit is reached first).

@xla xla added the C:abci Component: Application Blockchain Interface label Aug 27, 2018
@@ -946,14 +947,25 @@ func (cs *ConsensusState) createProposalBlock() (block *types.Block, blockParts
return
}

maxBytes := cs.state.ConsensusParams.BlockSize.MaxBytes
// bound evidence to 1/10th of the block
evidence := cs.evpool.PendingEvidence(maxBytes / 10)
Contributor Author (@melekes, Aug 28, 2018)

Note: incoming block evidence is unbounded (same as the other parts).

Contributor

We should state this ratio more explicitly

Contributor Author

Spec for how we construct a block?

// It is wrapped by PriorityEvidence and PendingEvidence for convenience.
func (store *EvidenceStore) ListEvidence(prefixKey string) (evidence []types.Evidence) {
// If maxBytes is -1, there's no cap on the size of returned evidence.
func (store *EvidenceStore) ListEvidence(prefixKey string, maxBytes int) (evidence []types.Evidence) {
Contributor Author

make private?

@@ -118,7 +120,7 @@ func TestBlockMakePartSetWithEvidence(t *testing.T) {

partSet := MakeBlock(h, txs, commit, evList).MakePartSet(1024)
assert.NotNil(t, partSet)
assert.Equal(t, 3, partSet.Total())
assert.Equal(t, 2, partSet.Total())
Contributor Author

??

types/tx.go Outdated
@@ -11,6 +11,11 @@ import (
cmn "github.com/tendermint/tendermint/libs/common"
)

const (
// MaxAminoOverheadForTx - amino overhead to encode a transaction.
Contributor Author

maximum amino overhead seems to range from 1 to 4 bytes

types/block.go Outdated
MaxHeaderBytes = 478

// MaxAminoOverheadForBlock - maximum amino overhead to encode a block (up to
// MaxBlockSizeBytes in size).
Contributor Author

not including its parts (only the varint length + fields without data)

types/block.go Outdated

// MaxAminoOverheadForBlock - maximum amino overhead to encode a block (up to
// MaxBlockSizeBytes in size).
MaxAminoOverheadForBlock = 4
Contributor Author

verify

Contributor

This seems way too optimistic, imo. We have an extra byte of overhead for every field (field numbers). The txs are in the block as well, so each one of those will increase the total overhead. We'd have to calculate a maximum number of txs to compute MaxAminoOverheadForBlock.

Contributor Author (@melekes, Aug 30, 2018)

I know. It should probably be 4 + 1 byte for each field, so ~8.

> We'd have to calculate a maximum number of txs to create the MaxAminoOverheadForBlock

No - see what I've done with MaxAminoOverheadForTx. In short: when we add txs, we count len(tx)+MaxAminoOverheadForTx.

Contributor Author

It still seems a bit complex. If you or anybody else have ideas on how to simplify this, I'd be happy to hear them.

types/tx.go Outdated
const (
// MaxAminoOverheadForTx - maximum amino overhead to encode a transaction
// (ranges from 1 to 4 bytes).
MaxAminoOverheadForTx = 4
Contributor Author

verify

txs := make([]types.Tx, 0, mem.txs.Len())
for e := mem.txs.Front(); e != nil; e = e.Next() {
memTx := e.Value.(*mempoolTx)
if max > 0 && cur+len(memTx.tx)+types.MaxAminoOverheadForTx > max {
Contributor Author

It does not have to be Max. We could create an amino#EstimateSize(o interface{}) function.
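For readers skimming the diff, a hedged reconstruction of the full loop excerpted above (the break and the accumulation step are inferred from this discussion, not quoted from the PR):

```
// Inferred shape of ReapMaxBytes: accumulate txs until adding the next
// one, counted with its conservative per-tx amino overhead, would
// exceed the byte budget; max <= 0 means no cap.
var cur int
txs := make([]types.Tx, 0, mem.txs.Len())
for e := mem.txs.Front(); e != nil; e = e.Next() {
	memTx := e.Value.(*mempoolTx)
	if max > 0 && cur+len(memTx.tx)+types.MaxAminoOverheadForTx > max {
		break // budget reached; stop reaping
	}
	cur += len(memTx.tx) + types.MaxAminoOverheadForTx
	txs = append(txs, memTx.tx)
}
```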

@melekes melekes changed the title [WIP] MaxBytes MaxBytes Aug 31, 2018
@ebuchman (Contributor) left a comment

I think we can merge this, but there are a few things we should address before release - especially that it seems the numbers aren't quite right.

@@ -145,7 +145,9 @@ func (blockExec *BlockExecutor) Commit(block *types.Block) ([]byte, error) {
"appHash", fmt.Sprintf("%X", res.Data))

// Update mempool.
if err := blockExec.mempool.Update(block.Height, block.Txs); err != nil {
maxBytes := state.ConsensusParams.TxSize.MaxBytes
filter := func(tx types.Tx) bool { return len(tx) <= maxBytes }
Contributor

Maybe better to define this and export it.

Contributor

Also, do we want to allow a negative or zero MaxBytes to indicate unbounded?

Contributor

It seems we're not checking TxSize.MaxBytes for txs in the block, just in the mempool. Technically that makes it not a consensus param, because a proposer running different mempool code could include txs larger than TxSize.MaxBytes.

I wonder if we even need a TxSize.MaxBytes in the consensus params - it might be simpler to just use the computed maxDataBytes.

We'd still need Update to take a filter, but we'd be able to remove the TxSize configuration and just do it all through blocks.

Contributor

I think making TxSize.MaxBytes a consensus parameter should be based on how important we feel guarantees about the total number of transactions that can fit into a block are. (That's the only advantage I see to making it a consensus parameter.)

If we're unsure, I'd opt to include it, provided the implementation isn't hard. I like the idea of being able to make guarantees about the number of txs that can fit into a block, though I can't find a super convincing reason as to why.

Contributor

> though I can't find a super convincing reason as to why.

Right. We have a lot of surface area. When we don't have a really good reason to have something, it's probably worth removing it. That helps simplify things overall and reduces the possibility for errors.

Contributor Author (@melekes, Sep 3, 2018)

> Maybe better to define this and export it.

That was the initial implementation, but the problem is in

Update(height int64, txs types.Txs, filter func(types.Tx) bool) error

It's undesirable to import the mempool package (just for mempl.Filter). Am I wrong?

Contributor Author

> We'd still need Update to take a filter, but we'd be able to remove the TxSize configuration and just do it all through blocks.

Should this be done along with enforcing MaxGas?

Height: 10,
Time: time.Now().UTC(),
NumTxs: 100,
TotalTxs: 200,
Contributor

Aren't these varints? What if we had TotalTxs: 20000 - then the test fails because the header is 479 bytes.
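A standalone illustration of the point, using encoding/binary's uvarint, whose width grows with the value the same way amino's varints do (hence the follow-up fix that computes these constants with math.MaxInt64):

```
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// Varint widths depend on the value, so a "maximum header size" has to
// be measured with maximal field values, not typical ones.
func main() {
	buf := make([]byte, binary.MaxVarintLen64)
	for _, v := range []uint64{200, 20000, math.MaxInt64} {
		fmt.Printf("%d encodes to %d bytes\n", v, binary.PutUvarint(buf, v))
	}
	// prints: 2, 3, and 9 bytes respectively
}
```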

const chainID = "mychain"
ev := &DuplicateVoteEvidence{
PubKey: secp256k1.GenPrivKey().PubKey(), // use secp because its pubkey is longer
VoteA: makeVote(val, chainID, 0, 10, 2, 1, blockID),
Contributor

Same concern as above.

@@ -248,6 +248,8 @@ func NewNode(config *cfg.Config,
mempl.WithMetrics(memplMetrics),
)
mempool.SetLogger(mempoolLogger)
maxBytes := state.ConsensusParams.TxSize.MaxBytes
mempool.SetFilter(func(tx types.Tx) bool { return len(tx) <= maxBytes })
Contributor

should we make this a mempool option and set it on the constructor instead?

Contributor

I think it makes sense to add it to the mempool config.
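A sketch of the option approach, mirroring the WithMetrics pattern visible in the hunk above (names here are illustrative; the follow-up commit "pass mempool filter as an option" is the authoritative version):

```
// Illustrative sketch of a functional option carrying the tx filter, so
// it is set at construction rather than via SetFilter afterwards.
func WithFilter(f func(types.Tx) bool) MempoolOption {
	return func(mem *Mempool) { mem.filter = f }
}

// node.go would then read roughly:
//   mempool := mempl.NewMempool(config.Mempool, proxyApp.Mempool(), height,
//       mempl.WithMetrics(memplMetrics),
//       mempl.WithFilter(func(tx types.Tx) bool { return len(tx) <= maxBytes }),
//   )
```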

@ebuchman ebuchman merged commit ffe91ae into develop Aug 31, 2018
@ebuchman ebuchman deleted the 2035-max-bytes branch August 31, 2018 19:16
melekes added a commit that referenced this pull request Sep 3, 2018
this mirrors ReapMaxBytes behavior

See #2184 (comment)
melekes added a commit that referenced this pull request Sep 3, 2018
melekes added a commit that referenced this pull request Sep 3, 2018
melekes added a commit that referenced this pull request Sep 3, 2018
melekes added a commit that referenced this pull request Sep 4, 2018
* ReapMaxTxs: return all txs if max is negative

this mirrors ReapMaxBytes behavior

See #2184 (comment)

* increase MaxAminoOverheadForBlock

tested with:

```
func TestMaxAminoOverheadForBlock(t *testing.T) {

        maxChainID := ""
        for i := 0; i < MaxChainIDLen; i++ {
                maxChainID += "𠜎"
        }

        h := Header{
                ChainID:            maxChainID,
                Height:             10,
                Time:               time.Now().UTC(),
                NumTxs:             100,
                TotalTxs:           200,
                LastBlockID:        makeBlockID(make([]byte, 20), 300, make([]byte, 20)),
                LastCommitHash:     tmhash.Sum([]byte("last_commit_hash")),
                DataHash:           tmhash.Sum([]byte("data_hash")),
                ValidatorsHash:     tmhash.Sum([]byte("validators_hash")),
                NextValidatorsHash: tmhash.Sum([]byte("next_validators_hash")),
                ConsensusHash:      tmhash.Sum([]byte("consensus_hash")),
                AppHash:            tmhash.Sum([]byte("app_hash")),
                LastResultsHash:    tmhash.Sum([]byte("last_results_hash")),
                EvidenceHash:       tmhash.Sum([]byte("evidence_hash")),
                ProposerAddress:    tmhash.Sum([]byte("proposer_address")),
        }
        b := Block{
                Header:     h,
                Data:       Data{Txs: makeTxs(10000, 100)},
                Evidence:   EvidenceData{},
                LastCommit: &Commit{},
        }

        bz, err := cdc.MarshalBinary(b)
        require.NoError(t, err)

        assert.Equal(t, MaxHeaderBytes+MaxAminoOverheadForBlock-2, len(bz)-1000000-20000-1)
}
```

* fix MaxYYY constants calculation

by using math.MaxInt64

See #2184 (comment)

* pass mempool filter as an option

See #2184 (comment)

* fixes after Dev's comments
@ebuchman ebuchman mentioned this pull request Sep 21, 2018
Labels: C:abci (Application Blockchain Interface), C:consensus (Consensus), C:evidence (Evidence)