How to set the batch size?

How can I set the batch size to match the different gradient lengths of each layer? I found that when the batch size is set very large but the first layer's gradient has only a few hundred parameters, the data is still transmitted at the full batch size, which makes transmission very inefficient.
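
To make the inefficiency concrete, here is a rough sketch of the arithmetic. The layer lengths and the fixed batch size are made-up numbers, the helper names are hypothetical, and the code assumes the batch size has to be a power of two; it does not call any encryption library.

```python
# Illustration only: all numbers and helper names are hypothetical.

def slot_utilization(layer_len: int, batch_size: int) -> float:
    """Fraction of the transmitted slots that hold real gradient values."""
    if layer_len > batch_size:
        raise ValueError("layer does not fit in a single batch")
    return layer_len / batch_size


def next_power_of_two(n: int) -> int:
    """Smallest power of two >= n (assumes a power-of-two batch size is required)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


layer_lengths = [300, 5000, 8192]  # hypothetical per-layer gradient lengths
fixed_batch = 8192                 # hypothetical batch size shared by all layers

for n in layer_lengths:
    fitted = next_power_of_two(n)  # batch size chosen per layer instead
    print(
        f"len={n}: fixed batch uses {slot_utilization(n, fixed_batch):.1%} of slots, "
        f"per-layer batch {fitted} uses {slot_utilization(n, fitted):.1%}"
    )
```

With these example numbers, a 300-value layer fills under 4% of an 8192-slot batch, while a batch size fitted to that layer (512) would fill more than half of it.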

This duplicates "How to do dynamic batch packaging of plaintext?" (see reply #2 by ypolyakov).