State of the Slop

Every once in a while I pick a model and vibe up a slop to assess my need to resort to . I then put it here.

Note: I may update these projects after their initial publication date using the same or similar models, and the complexity of these projects does not reflect the full capabilities of each model. Coding LLMs currently tend to excel at impressive demos, but fall short on quality, structure, details, consistency, and the simple things in maintained projects.

  • Courses
    GPT-5.4, MiMo-V2-Pro, Qwen3.5-397B-A17B-FP8 (prompter)

    A placeholder showing how multiple projects share the same quarter section.

    "This is a weird one. With this I went more hands-off by prompting my OpenClaw (which itself is a vibe-coded slop) running Qwen3.5 to call Codex with GPT-5.4 medium, and then switched to OpenCode with MiMo-V2-Pro when that ran out of quota.

    When the switch happened, what the frontier GPT model had left us was an absolute slop of an overengineered monorepo: a backend for account log-ins and "social features", the worst stereotypical generated-webapp feel you can think of (bluish gradients, round hover borders, slow animations), broken layouts, weird useless pages that didn't even work, and an inaccessible course itself (some of the fault lies in my initial prompting to make it a "platform"). Then MiMo turned it into the near-current minimalistic design with no extra features and a working course in... 1 turn.

    I mostly just focused on higher-level prompts to the claw, and it then prompted the coding CLIs with more specific, actionable feature requests in the AI-fashioned markdown format. I didn't really expect it to work this well... the taped-together double prompting. I later checked the instructions it gave to the CLIs, and they were very accurate, detailed, and high-effort for an AI that I thought would have little understanding of the codebase itself.

    After some of the rounds, everything would seem to have turned into an unfixable mess, with truly unrelated things (for example a broken dark mode, badly styled buttons, and exercise questions containing no stem) just broken or making no sense, which I thought would need intense hand-holding and explicit fixing rounds on all the broken parts I found. But no: explicitly prompting to fix one of those issues did nothing to fix it, and then an unrelated iteration turn would somehow fix ALL of them and everything was so back again. We truly don't understand modern technology.

    Still some level of manual checking of the codebase going on, but less."

    This site (styling only)

    Gemini 3.1 Pro

    State of the Slop index site

    "Still a bit infuriating to use. If nothing else, 3.1 Pro is pretty good at web dev aesthetics. A fair amount of hand-holding was involved in the design."

  • Gallery
    GPT-5 and possibly others

    Display many images/videos and fit them on one screen

    "A bit surprisingly, GPT-5 turned out to be much more pleasant to use than Opus 4.1 in practice (if I remember the Claude model I used correctly). The amount of pain is still at incomprehensible levels."