A new paper suggests diminishing returns from larger and larger generative AI models. Dr Mike Pound discusses.

The Paper (No “Zero-Shot” Without Exponential Data): https://arxiv.org/abs/2404.04125

  • magic_lobster_party@kbin.run · 2 months ago

    Improvements are made all the time. You can’t feed a very large SVM the same data you’d give a transformer network and expect comparable performance. Transformers are used because they learn complicated patterns from less data more easily.

    I think I’ve read somewhere that a neural network with only one hidden layer can in theory approximate any function (if the hidden layer is wide enough), but the amount of data and hidden units required to actually do so makes it impractical.
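
    As a rough illustration of that idea (not anything from the paper or the video), here is a minimal NumPy sketch of a single-hidden-layer network with tanh units fitted to sin(x) by gradient descent. The hidden width (50), learning rate, step count, and target function are all arbitrary choices for the demo.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: the function we want the one-hidden-layer network to approximate.
    x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
    y = np.sin(x)

    hidden = 50                          # one hidden layer of 50 tanh units
    W1 = rng.normal(0, 1.0, (1, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, 1))
    b2 = np.zeros(1)
    lr = 0.01

    for step in range(20_000):
        # Forward pass: x -> tanh(x W1 + b1) -> linear output
        h = np.tanh(x @ W1 + b1)
        pred = h @ W2 + b2
        err = pred - y

        # Backward pass: gradients of the squared-error loss
        # (the constant factor is absorbed into the learning rate)
        n = len(x)
        dW2 = h.T @ err / n
        db2 = err.mean(axis=0)
        dh = (err @ W2.T) * (1 - h ** 2)
        dW1 = x.T @ dh / n
        db1 = dh.mean(axis=0)

        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    print("final MSE:", float((err ** 2).mean()))
    ```

    Widening the hidden layer lets it fit more complex targets, but the number of units and the amount of training data needed grow quickly, which is the practical limitation the comment above alludes to.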

    Over time, other model architectures will likely be discovered that make better use of the same training data.