One limitation of StyleGAN is that it generates a "pyramid" of images. The first layer makes a 4x4 image, which is upscaled and passed through the next layer (8x8), and so on, until out pops the final 1024x1024.
Jun 30, 2020 · 7:54 AM UTC
It's limiting because by the time you reach 32x32, the overall structure of the object is established (is this a face? is it a dog?) yet only the first 4 layers of the model were allowed to contribute to that decision! For a 1024x1024 model, that's only 4 of the 9 resolution layers (4x4 up to 1024x1024), so the other 5 layers of weights have no say in it.
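To make the layer arithmetic concrete, here's a rough sketch that lists the pyramid's resolutions and splits them at 32x32. It assumes one layer per resolution and a clean doubling at every step, which is a simplification (real implementations may count layers differently); `pyramid_resolutions` and `decided_at` are made-up names for illustration.

```python
def pyramid_resolutions(start=4, final=1024):
    """Return the resolutions of a pyramid that doubles at every stage."""
    sizes = []
    size = start
    while size <= final:
        sizes.append(size)
        size *= 2
    return sizes


stages = pyramid_resolutions()

# Assumption from the thread: overall structure is settled by 32x32.
decided_at = 32
early = [s for s in stages if s <= decided_at]  # stages that shape the structure
late = [s for s in stages if s > decided_at]    # stages that only refine detail

print(stages)                # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
print(len(early), len(late)) # 4 5
```

Under that counting, more than half of the stages only ever see an image whose overall structure is already fixed.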
This may be why BigGAN is better than StyleGAN at modeling complex objects, since BigGAN doesn't use a pyramid at all.
I think the pyramid is worth keeping, but I propose a modification: have every slice of the pyramid pass through the entire model. Expensive, but maybe better.