Claude Code is locking people out for hours

(github.com)

196 points | by sh1mmer 2 hours ago

45 comments

  • xantronix 1 hour ago
    As much as people on Hacker News complain about subscription models for productivity and creativity suites, the open-arms embrace of subscription development tools (services, really) that seek to offload the very act of development itself makes me wonder how and why so many people are eager to dive right in. I get it. LLMs are cool technology.

    Is this a symptom of the same phenomenon behind the deluge of disposable JavaScript frameworks of just ten years ago? Is it peer pressure, fear of missing out? At its root, I suspect so; of course, I imagine it's rare for the C-suite to have ever mandated the use of a specific language or framework, and from a business perspective LLMs represent an unprecedented lever of power, an even bigger shot at first-mover advantage. (Yes, I am aware of how "good enough" local models have become for many.)

    I don't really have anything useful or actionable to say here regarding this dialling back of capability to deal with capacity issues. Are there any indications of shops or individual contributors with contingency plans on the table for dialling back LLM usage in kind to mitigate these unknowns? I know the calculus is such that potential (and frequently realised) gains heavily outweigh the risks of going all in, but, in the grander scheme of time and circumstance, long-term commitments are starting to look more apparently risky. I am purposefully trying to avoid "begging the question" here; if instead of LLMs, this were some other tool or service, reactions to these events would have been far more pragmatic, with less of a reticence to invest time on in-house solutions when dealing with flaky vendors.

    • rurp 1 hour ago
      HN is a big community that has always had a mix of people who value newness as a feature vs those who prioritize simplicity and reliability. Unless you're recognizing the exact same names voicing these contradictory opinions, it's probably different groups of people for the most part.

      It seems like every LLM thread for the past couple years is full of posts saying that the latest hot AI tool/approach has made them unbelievably more productive, followed by others saying they found that same thing underwhelming.

      • echelon 1 hour ago
        > I get it. LLMs are cool technology.

        I don't think many of you have legitimately tried Claude Code, or maybe you're holding it wrong.

        I'm getting 10x the work done. I'm operating at all layers of the stack with a speed I've never had before.

        And before anyone accuses me of being some "vibe coder", I've built five nines active-active money rails that move billions of dollars a day at 50kqps+, amongst lots of other hard hitting platform engineering work. Serious senior engineering for over a decade.

        This isn't just a "cool technology". We've exited the punch card phase. And that is hard or impossible to come back from.

        If you're not seeing these same successes, I legitimately think you're using it wrong.

        I honestly don't like subscription services, hyperscaler concentration of power, or the fact I can't run Opus locally. But it doesn't matter - the tool exists in the shape it does, and I have to consume it in the way that it's presented. I hope for a different offering that is more democratic and open, but right now the market hasn't provided that.

        It's as if you got access to fiber or broadband and were asked to go back to ISDN/dial up.

        • nerptastic 1 hour ago
          Man, I really thought this was satire. It’s phenomenal that you can gain 10x benefits at all layers of the stack; you must have a very small development team or work alone.

          I just don’t see how I could export 10x the work and have it properly validated by peers at this point in time. I may be able to generate code 10-20x faster, but there are nuances that only a human can reason about in my particular sector.

          • suzzer99 14 minutes ago
            Senior engineer with 25 years of experience here. I wish I spent enough time actually coding that 10x-ing my coding productivity would matter much to my job. Most of my day is spent wrangling requirements, looking after junior devs, stamping out confusion brush fires, and generally just trying to steer the app away from a trainwreck down the line.

            When I do code, it's almost always something novel that I don't know how I'm going to do until I code a few pieces and see how they fit together. If it's a fairly routine feature based on an existing pattern, I assign it to one of the other devs.

          • nothinkjustai 6 minutes ago
            It is satire! They have been doing this bit for a while and people keep falling for it lol
          • hsuduebc2 1 hour ago
            I noticed that too, at first. It vaguely reminded me of the famous Navy SEAL copypasta.
        • dandellion 1 hour ago
          You must be using it wrong, because I'm getting 100x the work done and am currently at 1.5 million MRR with this SaaS I vibe-coded over the weekend.

          After I solved entrepreneurship I decided to retire and I now spend my days reading HN, posting on topics about AI.

          • darth_aardvark 55 minutes ago
            You're still manually posting? All of my HN posting, trolling, shitposting and spamming is taken care of by a fleet of bots I vibecoded in the last 5 minutes.
            • slowmovintarget 7 minutes ago
              You jest, but I know people who've done this.

              "I gotta be present." Me: Reenacting the Malcolm Reynolds too many responses meme.

        • dwaltrip 31 minutes ago
          It’d be cool to see your process in depth. You should record some of your sessions :)

          I mostly believe you. I have seen hints of what you are talking about.

          But oftentimes I feel like I’m on the right track when I’m actually just spinning my wheels and the AI is just happily going along with it.

          Or I’m getting too deep on something and I’m caught up in the loop, becoming ungrounded from the reality of the code and the specific problem.

          If I notice that and am not too tired, I can reel it back in and re-ground things. Take a step back and make sure we are on a reasonable path.

          But I’m realizing it can be surprisingly difficult to catch that loop early sometimes. At least for me.

          I’ve also done some pretty awesome shit with it that either would have never happened or taken far longer without AI — easily 5x-10x in many cases. It’s all quite fascinating.

          Much to learn. This idea is forming for me that developing good “AI discipline” is incredibly important.

          P.s. sometimes I also get this weird feeling of “AI exhaustion”. Where the thought of sending another prompt feels quite painful. The last week I’ve felt that a lot.

          P.p.s. And then of course this doesn’t even touch on maintaining code quality over time. The “after” part, when the LLM implements something. There are lots of good patterns and approaches for handling this, but it’s a distinct phase of the process with lots of complexities and nuances. And it’s oh-so-tempting to skip or postpone. More so when the AI output is larger — exactly when you need it most.

        • Aurornis 13 minutes ago
          I use Claude Code a lot, but I don't understand these "I'm doing 10X the work" comments.

          I spend a lot of time reviewing any code that comes out of Claude Code. Even using Opus 4.6 with max effort there is almost always something that needs to be changed, often dramatically.

          I can see how people go down the path of thinking "Wow, this code compiles and passes my tests! Ship it!" and start handing trust over to Opus, but I've already seen what this turns into 6 months down the road: Projects get mired down in so much complexity and LLM spaghetti that the codebase becomes fragile. Everyone is sidetracked restructuring messy code from the past, then fighting bugs that appear in the change.

          I can believe some of the more recent studies showing LLMs can accelerate work by circa 20% (1.2X), because that's on the same order of magnitude as what I and others are seeing with careful use.

          When someone comes out and claims 10X more output, I simply cannot believe they're doing careful engineering work instead of just shipping the output after a cursory glance.

        • xantronix 34 minutes ago
          Mind if I use this as a copypasta for the future? This checks off every point people bring on LinkedIn and elsewhere.

          In all seriousness though, writing code, or even sitting down and properly architecting things, has never been the bottleneck for me. It has either been artificial deadlines preventing me from writing proper unit tests, or the requirement for code review from people on my team who don't even work on the same codebase as I do on a daily basis. I have often stated, and stand by, the assertion that I develop at the speed of my own understanding, and I think that is a good virtue to carry forth, one that will stand the test of time and bring about the best organisational outcomes. It's just a matter of finding the right place that values this approach.

          • mikebenfield 17 minutes ago
            > artificial deadlines preventing me from writing proper unit tests, or the requirement for code review from people on my team who don't even work on the same codebase as I do on a daily basis

            I have never experienced this, and it sounds remarkably dysfunctional to me.

        • ericmcer 1 hour ago
          I mean at this point can we just conclude that there are a group of engineers who claim to have incredible success with it and a group that claim it is unreliable and cannot be trusted to do complex tasks.

          I struggle to believe that a ton of seemingly intelligent software engineers are too dumb to figure out how to use Claude code to get reliable results, it seems much more likely to me that it can do well at isolated tasks or new projects but fails when pointed at large complex code bases because it just... is a token predictor lol.

          But yeah spinning up a green fields project in an extensively solved area (ledgers) is going to be something an AI shines at.

          It isn't like we don't use this stuff also, I ask Cursor to do things 20x a day and it does something I don't like 50% of the time. Even things like pasting an error message it struggles with. How do I reconcile my actual daily experience with hype messages I see online?

          • dns_snek 9 minutes ago
            > But yeah spinning up a green fields project in an extensively solved area (ledgers) is going to be something an AI shines at.

            I couldn't disagree with this more. It's impressive at building demos, but asking it to build the foundation for a long-term project has been disastrous in my experience.

            When you have an established project and you're asking it to color between the lines it can do that well (most of the time), but when you give it a blank canvas and a lot of autonomy it will likely end up generating crap code at a staggering pace. It becomes a constant fight against entropy where every mess you don't clean up immediately gets picked up as "the way things should be done" the next time.

            Before someone asks, this is my experience with both Claude Code (Sonnet/Opus 4.6) and Codex (GPT 5.4).

          • rurp 45 minutes ago
            Right, I keep seeing people talking past each other in this same way. I don't doubt folks when they say they coded up some greenfield project 10x faster with Claude, it's clearly great at many of those tasks! But then so many of them claim that their experience should translate to every developer in every scenario, to the point of saying they must be using it wrong if they aren't having the same experience.

            Many software devs work in teams on large projects where LLMs have a more nuanced value. I myself mostly work on a large project inside a large organization. Spitting out lines of code is practically never a bottleneck for me. Running a suite of agents to generate out a ton of code for my coworkers to review doesn't really solve a problem that I have. I still use Claude in other ways and find it useful, but I'm certainly not 10x more productive with it.

          • hombre_fatal 1 hour ago
            I suspect many people here have tried it, but they expected it to one-shot any prompt, and when it didn't, it confirmed what they wanted to be true and they responded with "hah, see?" and then washed their hands of it.

            So it's not that they're too stupid. There are various motivations for this: clinging on to familiarity, resistance to what feels like yet another tool, anti-AI koolaid, earnestly underwhelmed but don't understand how much better it can be, reacting to what they perceive to be incessant cheerleading, etc.

            It's kind of like anti-Javascript posts on HN 10+ years ago. These people weren't too stupid to understand how you could steelman Node.js, they just weren't curious enough to ask, and maybe it turned out they hadn't even used Javascript since "DHTML" was a term except to do $(".box").toggle().

            I wish there were more curiosity on HN.

            • ericmcer 51 minutes ago
              So what do I do differently then?

              Hypothetically, you have a simple slice out of bounds error because a function is getting an empty string so it does something like: `""[5]`.

              Opus will add a bunch of length & nil checks to "fix" this, but the actual issue is the string should never be empty. The nil checks are just papering over a deeper issue, like you probably need a schema level check for minimum string length.

              At that point do you just tell it "no, delete all that, the string should never be empty" and let it figure that out, or do you basically need to pseudo-code "add a check for empty strings to this file on line 145", or do you just YOLO and accept that the issue is gone for now, so it is no longer your problem?

              My bigger point is how does an LLM know that this seemingly small problem is indicative of some larger failure, like lets say this string is a `user.username` which means users can set their name to empty which means an entire migration is probably necessary. All the AI is going to do is smoosh the error messages and kick the can.
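
              The distinction in question (papering over at the call site vs. fixing at the source) could be sketched like this, in Python purely for illustration; `validate_username` and `initial` are hypothetical names, not from any real codebase:

```python
# Root-cause fix: enforce the invariant ("username is never empty")
# once, at the schema/boundary layer, as the comment above suggests.
def validate_username(name: str) -> str:
    if not name:
        raise ValueError("username must not be empty")
    return name

# Downstream code can then index freely; the precondition is guaranteed
# by the boundary check rather than by ad-hoc length/nil checks at
# every use site (the band-aid pattern being criticized).
def initial(name: str) -> str:
    return name[0]
```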

              • julian37 14 minutes ago
                Use planning+execution rather than one-shotting, it'll let you push back on stuff like this. I recommend brainstorming everything with https://github.com/obra/superpowers, at least to start with.

                Then work on making sure the LLM has all the info it needs. In this example it sounds like perhaps your hypothetical data model would need to be better typed and/or documented.

                But yeah as of today it won't pick up on smells as you do, at least not without extra skills/prompting. You'll find that comforting or annoying depending on where you stand...

              • UI_at_80x24 25 minutes ago
                I have encountered the exact same kind of frustration, and no amount of prompting seems to prevent it from "randomly" happening.

                `the error is on line #145 fix it with XYZ and add a check that no string should ever be blank`

                It's the randomness that is frustrating, and the fact that the fix would be quicker to input manually, that drives me crazy. I fear that all the "rules" I add to claude.md are wasting my available tokens, so it won't have enough room to process my request.

                • unshavedyak 14 minutes ago
                  Yup, this is why I firmly believe true productivity, as in it actually making you faster, is limited by the speed of review.

                  I think Claude makes me faster, but the struggle always centers on reviewing code fully and retaining my own context: reviewing fully to make sure it’s correct and the way I want it, and retaining context to speed up reviews and not get lost.

                  I firmly believe people who are seeing massive gains are simply ignoring x% lines of code. There’s an argument to be made for that being acceptable, but it’s a risk analysis problem currently. Not one I subscribe to.

              • dpkirchner 30 minutes ago
                Not the person you're replying to but yes, sometimes I do tell the agent to remove the cruft. Then I back up a few messages in the context and reword my request. Instead of just saying "fix this crash", or whatever, I say "this is crashing because the string is empty, however it shouldn't be empty, figure out why it's empty". And I might have it add some tests to ensure that whatever code is not returning/passing along empty strings.
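
                The "add some tests" step could look like this minimal sketch; `fetch_usernames` is a hypothetical stand-in for whatever code path was emitting the empty string:

```python
# Hypothetical data-access function; in a real codebase this is the
# code path that was producing the empty string.
def fetch_usernames():
    return ["alice", "bob"]

# Regression test pinning the root-cause fix: the source never emits "".
def test_no_empty_usernames():
    assert all(name for name in fetch_usernames())
```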
          • rattlesnakedave 1 hour ago
            “I struggle to believe that a ton of seemingly intelligent software engineers are too dumb to figure out how to use Claude code to get reliable results”

            Seemingly is doing the heavy lifting here. If you read enough comment threads on HN, it will become obvious why they aren’t getting results.

            • alwillis 32 minutes ago
              > I struggle to believe that a ton of seemingly intelligent software engineers are too dumb to figure out how to use Claude code to get reliable results.

              They're not dumb, but I'm not surprised they're struggling.

              A developer's mindset has to change when adding AI into the mix, and many developers either can’t or won’t do that. Developers whose commits look something like "Fixed some bugs" probably aren’t going to take the time to write a decent prompt either.

              Whenever there's a technology shift, there are always people who can't or won't adapt. And let's be honest, there are folks whose agenda (consciously or not) is to keep the status quo and "prove" that AI is a bad thing.

              No wonder we're seeing wildly different stories about the effectiveness of coding agents.

            • dandellion 57 minutes ago
              Here's my 100-file custom scaffolding AI prompt that I've been working on for the last four months, and it can reliably one-shot most math olympiad problems and even a Rust to-do list.
        • embedding-shape 1 hour ago
          > and I have to consume it in the way that it's presented

          I'm just curious, why do you "have to"? Don't get me wrong, I'm making the same choice myself, accepting a bunch of global drawbacks because of my local/personal preference, but I won't claim I have to; it's a choice I'm making because I'm lazy.

          • wongarsu 1 hour ago
            What are the reasonable options besides a Claude Code subscription (or an equivalent from Codex or Copilot)?

            I could pay API prices for the same models, but aside from paying much more for the same result that doesn't seem helpful

            I could pay a 4-5 figure sum for hardware to run a far inferior open model

            I could pay a six figure sum for hardware to run an open model that's only a couple months behind in capability (or a 4-5 figure sum to run the same model at a snail's pace)

            I could pay API costs to a semi-trustworthy inference provider to run one of those open models

            None of those seem like great alternatives. If I want cutting-edge coding performance then a subscription is the most reasonable option

            Note that this applies mostly to coding. For many other tasks local models or paid inference on open models is very reasonable. But for coding that last bit of performance matters

          • echelon 1 hour ago
            My job title is "provide value".

            I'm given a tool that lets me 10x "provide value".

            My personal preferences and tastes literally do not matter.

            • embedding-shape 1 hour ago
              As a professional you have a choice in how you produce whatever it is you produce. Sure, you can go for the simplest, most expensive and "easiest" way of doing things, or you can do other things, depending on your perspective and requirements. None of this is set in stone, some people make choices based on personal preferences, and that matters as much to them as your choices matter to you.
        • ipaddr 45 minutes ago
          I'm getting 1,000x improvement building notepad applications with 6 9s. No one is faster.

          Need some help selling these notepad apps, do you have a prompt for that?

        • epistasis 1 hour ago
          I'm still reviewing all the code that's created, and asking for modifications, and basically using LLMs as a 2000 wpm typist, and seeing similar productivity gains. Especially in new frameworks! Everything is test driven development, super clean and super fast.

          The challenge now is how to plan architectures and codebases to get really big and really scale, without AI slop creating hidden tech debt.

          Foundations of the code must be very solid, and the architecture from the start has to be right. But even redoing the architecture becomes so much faster now...

        • britzkopf 1 hour ago
          > And before anyone accuses me of being some "vibe coder", I've built five nines active-active money rails that move billions of dollars a day at 50kqps+, amongst lots of other hard hitting platform engineering work. Serious senior engineering for over a decade

          You sound like a pro wrestler. I'd like to know what "hard-hitting" engineering work is. Hydraulic hammers?

          • dmoy 47 minutes ago
            I mean five nines is legitimately difficult to accomplish for a lot of problem spaces.

            It's also, like... difficult to honestly and accurately measure. And to account for whether or not you're getting lucky because your underlying dependencies (servers, etc.) aren't crashing as much as advertised, or whether it's actually five nines. Or whether you've run it for a month, gotten <30s of measured downtime, and declared victory, vs run it for three years with copious software updates.

            I always assume most people claiming five nines are just not measuring it correctly, or have not exercised the full set of things that will go wrong over a long enough period of time (dc failures, network partitions, config errors, bad network switches that drop only UDP traffic on certain ports, erroneous ACL changes, bad software updates, etc etc)

            Maybe they did it all correct though, in which case, yea, seems hard hitting to me.

        • surgical_fire 19 minutes ago
          I read this as satire. I still think it is.
        • blurbleblurble 1 hour ago
          > fact I can't run Opus locally

          Yet

    • michael_j_x 1 hour ago
      I really enjoy coding. I've built a number of projects, personal and professional, with Python, Rust, Java and even some Scala in the mix. However, I've been addicted to Claude Code recently, especially with the superpowers skill. It feels like I can manifest code with my mind. When developing with Claude, I am presented with design dilemmas, architectural alternatives, clarification questions, things that really make me think about the problem. I then choose a solution, propose alternatives, discuss, and the code manifests. I came to realize that I enjoy the problem solving, not the actual act of writing the code. It's like I have almost cloned myself, and my clones are working on the projects and coming back to me for instructions. It feels amazing
      • throw4847285 1 hour ago
        "Addicted" "Superpowers" "manifest with my mind" "it feels amazing"

        Why does it sound like you're on drugs? I know that sounds extremely rude, but I can't think of any other reasonable comparison for that language.

        It's hard to take these kinds of endorsements seriously when they're written so hyperbolically, in the same cliches, and focused entirely on how it makes you feel rather than what it does.

        • cbg0 1 hour ago
          Reading a bunch of posts related to Claude Code, some folks voice genuine upset about rate limits and model intelligence, while others seem very upset that they can't get their fix because they've hit the five-hour limit. It's genuinely concerning how addictive LLMs can be for some folks.
          • throw4847285 1 hour ago
            I think the social aspect is underreported, and that it applies even to people using Claude Code, not just those treating an LLM as a therapist. In other words, I wonder how many of these people can't call their doctor to make an appointment or call a restaurant to order a pizza. And I say this as someone who struggles to do those things.

            People claim that DoorDash and other similar apps are about efficiency, but I suspect a large portion is also a desire to remove human interaction. LLMs are the same. Or, in actuality, to create a simulacrum of human interaction that is satisfying enough.

          • vidarh 54 minutes ago
            It's reflecting the value we get from it, relative to the cost of continuing if we switch to the API pricing. It is genuinely upsetting to hit the limits when you face a substantial drop in productivity.

            Imagine being an Uber driver and suddenly having to switch to a rickshaw for several hours.

        • djmips 1 hour ago
          The drug is the llm coding. I kind of get it; when I was a kid and first got a computer, I felt the same way after I learned assembly language. The world was your oyster and you could do what felt like anything. It was why I spent almost every waking hour on my computer. That wore off eventually, but I've spent some time on my backlog of projects with Claude and it feels a bit like the old days again.
        • michael_j_x 1 hour ago
          "superpowers" is the exact name of the specific Claude Code skill. The rest is just me expressing my excitement; until recently I was very skeptical of the whole vibe-coding movement, but I have since done a complete 180.
        • guzfip 1 hour ago
          > Why does it sound like you're on drugs, specifically cocaine?

          This has basically been what all of Silicon Valley sounds like to me for a few years now.

          They are known for abusing many psycho-stimulants out there. The stupid “manifesto” Marc Andreessen put out a while back sounded like adderall-produced drivel more than a coherent political manifesto.

          • throw4847285 1 hour ago
            If I were to go off into the woods, take a lot of drugs, and write my own crank manifesto, the central conceit would be that ADHD is the key to understanding the entirety of Silicon Valley. A bunch of people with stimulus driven brains creating technologies that feed themselves and the rest of the populace more and more stimulation, setting a new baseline and requiring new technologies for higher levels of stimulation in an endless loop until we all stimulate ourselves to death. Delayed gratification is the enemy.

            This is similar to how we have already found hacks in our evolutionary programming to directly deliver high amounts of flavor without nutrition, and we've been working on ever more complex means of delivering social stimulation without the need for other humans (one of the key appeals of AI for many people, as well).

            Of course these are all the ravings of a crank and should be ignored.

            • throwaway27448 1 hour ago
              No, you're right. But a million monkeys on cocaine may eventually provide value to shareholders.
      • withinboredom 1 hour ago
        I feel this sentiment. It’s more like pair programming with someone both smarter and dumber than you. If you’re reviewing the code it is putting down, you’re likely to spot what it’s getting wrong and discuss it.

        What I don’t understand are the people who let it go overnight, or with whole “agent teams” working on software. I have no idea how they trust any of it.

      • snarfy 1 hour ago
        Yep, I want to make stuff. Writing the code by hand was just a means to an end.
      • skydhash 1 hour ago
        That’s like saying enjoying composing music, but not enjoying playing music. Or creating stories, but not enjoying writing. Yes, they’re different activities, but they’re linked together. The former is creativity; the latter is a medium of transmission.

        Code is notation, just like music sheets or food recipes. If your interaction with anyone else is with the end result only (the software), then the code does not matter. But for collaboration, it does. When it’s badly written, that just increases everyone’s burden.

        It’s like forcing everyone to learn a symphony from the recording instead of the sheets. And often a badly recorded version.

        • vidarh 49 minutes ago
          > That’s like saying enjoying composing music, but not enjoying playing music

          Do you think that is impossible? There are plenty of people who enjoy composing music on things like trackers, with no intent of ever playing said music on an instrument.

          I love coding, but I also like making things, and the two are in conflict: When I write code for the sake of writing code, I am meticulous and look for perfection. When I make things, I want to move as fast as possible, because it is the end-product that matters.

          There are also hidden presumptions in what you've written: 1) that the code will be badly written. Sometimes it is, but that is the case for people too, and often it is better than what I would produce (say, when I need to produce something in a language I'm not familiar enough with); 2) that the collaboration will be with people manually working on the code. That is increasingly often not true.

        • michael_j_x 1 hour ago
          Using your analogy, I enjoy composing music and enjoy playing music. I don't enjoy going through the motions of writing the notes on a piece of paper with a pen. I have to do it because people can't read my mind, but if they could, I would avoid it. Claude Code is like that. The code that gets written feels like the code that I would have written.
    • woctordho 1 hour ago
      Apart from local AI, a serious option is an aggregated API such as new-api [0]. An API provider that aggregates thousands of accounts has much better stability than a single account. It's also cheaper than the official API because of how the subscription model works; see e.g. the analysis [1].

      [0] https://github.com/QuantumNous/new-api

      [1] https://she-llac.com/claude-limits

      • gruez 1 hour ago
        >An API provider aggregated thousands of accounts has much better stability than a single account

        Isn't this almost certainly against ToS, at least if you're using "plans" (as opposed to paying per-token)?

        • woctordho 1 hour ago
          You don't even need to be a customer served by Anthropic or OpenAI, so their Terms of Service are irrelevant. That's how I, living in China, use almost-free Claude and GPT, which they don't sell here.
          • gruez 1 hour ago
            Wait, is this just something like openrouter, that routes your requests to different API providers, where you're paying per-token rates? Or is this taking advantage of fixed price plans, by offering an API interface for them, even though they're only supposed to be used with the official tools?
            • woctordho 55 minutes ago
              It's taking advantage of fixed price plans or even free plans.
        • throwaway27448 1 hour ago
          That seems like Anthropic's problem.
          • gruez 1 hour ago
            It's quickly going to be your problem when they figure out you're breaching the ToS and ban your account.
    • supriyo-biswas 1 hour ago
      > if instead of LLMs, this were some other tool or service, reactions to these events would have been far more pragmatic, with less of a reticence to invest time on in-house solutions when dealing with flaky vendors

      As an example, a long-term goal at the employer I work for is exactly this: run LLMs locally. There's a big infrastructure backlog, though, so it's waiting on those things, and hopefully we'll see good local models by then that can do what Claude Sonnet or GPT-5.3-Codex can do today.

    • torben-friis 47 minutes ago
      People will always go along with a removal of friction, even against their own interest. It's a natural bias: we have a preference for not spending energy.

      It's why we pay stupid amounts for takeout when it's a button away, it's why we accept the issues that come with online dating rather than breaking the ice outside, and it's why there have been decades of scams claiming to get you abs without effort...

      LLMs are the ultimate friction removal. They can remove gaps or mechanical work that regular programming can, but more importantly they can think for you.

      I'm convinced this human pattern is as dangerous as addiction. But it's so much harder to fight against, because who's going to be in favor of doing things with more effort rather than less? The whole point of capitalism is supposed to be that it rewards efficiency.

      • xantronix 31 minutes ago
        > stupid amounts for takeout

        Aw hell. You found my vice and my own cognitive dissonance here. If I want to truly stand by my convictions, I should probably cook more and log off. Waiting for signs that the tides are turning and that people are beginning to value a slower, more methodical approach again isn't doing anything in the current moment to stave off the genuine feelings of dread that have honestly led to some suicidal ideation.

        (this is serious and not sarcasm, by the way)

        • abnercoimbre 5 minutes ago
          Stand strong. Plenty of us working towards healthy offline communities.
    • DeathArrow 1 hour ago
      >I get it. LLMs are cool technology.

      It would be cool to run SOTA models on my own hardware but I can't. Hence, the subscription.

    • jimmaswell 1 hour ago
      Contingency plan? Just code without it like before. AI could disappear today and I would be very disappointed but it's not like I forgot how to code without it. If anything, I think it's made me a better programmer by taking friction away from the execution phase and giving me more mental space to think in the abstract at times, and that benefit has certainly carried over to my work where we still don't have copilot approved yet.
    • dyauspitr 1 hour ago
      I think most people understand the need for subscriptions here. It is an ongoing massive compute cost, and that’s what you’re paying for. Your local system is not capable of running the massive amount of compute required for this. If it were then we would see more people up in arms about it.
      • stephbook 1 hour ago
        We could run it locally, but the problems that matter simply don't change.

        We're paying for servers that sit idle at night, you can't find enough sysadmins for the current problems, the open-source models aren't as strong as the closed-source ones, and providing context (as in googling) means you hook everything up to the internet anyway. Where do you find the power, the cooling, and the space? What do you do with the GPUs after 3 years?

        Suddenly that $500/month/user seems like a steal.

    • stefan_ 52 minutes ago
      I would love nothing more than ditching Claude for a local solution tomorrow. But it doesn't exist today, so it is what it is; you gotta keep up with the Joneses.

      Maybe in 5 years we'll have an open weights model that is in the "good enough" category that I can run on a RTX 9000 for 15k dollars or whatever.

      • xantronix 18 minutes ago
        I don't want to get too much into the details but I don't work in or for the Valley and I don't think I'll ever be able to afford that sort of expenditure on computing. A down payment on a car, or a vital medical procedure? Sure. I'm probably not alone here.
    • cookiengineer 1 hour ago
      The great part is that you can always build your own selfhosted tools. There is nothing that can't be done at home, it's just a calculation of how much you're willing to spend.

      Lately, though, the ongoing RAM crisis is making things like this less feasible. But you can still use a lot of smaller models for coding and testing tasks.

      For planning tasks I'd use a cloud-hosted model, for now, because gemma4 isn't there yet and because GPU prices are still quite insane.

      The cool and fun part is that with ollama and vllm you can just build your own agentic environment IDE, give it the tools you like, and make the workflow however you like. And it isn't even that hard to do, it just needs a lot of tweaking and prompt fiddling.

      And on top of that: Use kiwix to selfhost Wikipedia, stackoverflow and devdocs. Give the LLM a tool to use the search and read the pages, and your productivity is skyrocketing pretty quickly. No need anymore to have internet, and a cheap Intel NUC is good enough for self-hosting a lot of containers already.

      Source: I am building my own offline agentic environment for Golang [1], which is pretty experimental, but sometimes it also works.

      [1] https://github.com/cookiengineer/exocomp
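
      The tool setup described above can be sketched minimally like this. This is an illustrative sketch, not code from exocomp: the kiwix lookup is stubbed with an in-memory dict (a real setup would query a kiwix-serve HTTP endpoint instead), and all names here are made up for the example.

```python
# Stand-in for a self-hosted kiwix/devdocs index (title -> article text).
# A real deployment would query kiwix-serve over HTTP instead of this dict.
LOCAL_DOCS = {
    "goroutines": "Goroutines are lightweight threads managed by the Go runtime.",
    "channels": "Channels are typed conduits for communication between goroutines.",
}

# Tool schema in the JSON shape that ollama/OpenAI-style chat APIs accept.
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": "Search the local documentation mirror.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

def search_docs(query: str) -> str:
    """Return the first article whose title contains the query."""
    q = query.lower()
    for title, text in LOCAL_DOCS.items():
        if q in title:
            return text
    return "No results."

# Dispatcher: route tool calls emitted by the model to local functions.
TOOLS = {"search_docs": search_docs}

def dispatch(tool_call: dict) -> str:
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["arguments"])

if __name__ == "__main__":
    # Simulate the model requesting a lookup.
    print(dispatch({"name": "search_docs", "arguments": {"query": "channels"}}))
```

      In the actual loop you would pass SEARCH_TOOL in the chat request and feed the dispatcher's result back to the model as a tool message.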

      • xantronix 20 minutes ago
        I'm definitely all in on self-hosting, though I rent my compute and pay for bandwidth with Linode and storage with rsync.net.

        The LLM bit though, personally, is just not for me.

    • onlyrealcuzzo 1 hour ago
      [dead]
    • garganzol 1 hour ago
      [dead]
  • mvkel 2 hours ago
    Mounting evidence that Claude Max users are put into one big compute fuel pool. Demand increased dramatically with OpenAI's DoD PR snafu (even though Anth was already working with the DoD? But I digress...). The pool hit a ceiling. Anth has no compute left to give. Hence people maxing out after 1 query. "Working on it" means finding a way to distill Claude Code that isn't enough of a quality degradation to be noticed[0], in order to get the compute pool operational again. The distillation will continue until uptime improves.

    [0] As of this writing, it's noticeable. Lots of "should I continue?" and "you should run this command if you want to see that information." Roadblocks I hadn't seen in a year+.

    • bmurphy1976 42 minutes ago
      There's another piece of the puzzle. Dario has very clearly stated they are not taking the OpenAI approach of spending a $trillion to scale right now and assuming the money comes later. They are spending significantly less and working towards profitability sooner.

      That means they are going to be far more constrained infrastructurally than some of the competition. I think that explains some of the constraints we are seeing.

      • mvkel 23 minutes ago
        He did say that. And, in virtually the same breath, said they would have to spend $trillions if they hope to remain SOTA, which they have to be [0],[1].

        They don't have compute because they didn't play the game and get the good rates a couple of years ago, and are now forced to work with third-rate providers. That's not a strategy.

        I would take everything he says with a huge grain of salt.

        [0] “We’re buying a lot. We’re buying a hell of a lot. We’re buying an amount that’s comparable to what the biggest players in the game are buying.”

        “Profitability is this kind of weird thing in this field. I don’t think in this field profitability is actually a measure of spending down versus investing in the business.”

        [1] “You don’t just serve the current models and never train another model, because then you don’t have any demand because you’ll fall behind.”

        So he's not spending so they can be profitable, AND spending as much as the biggest players are spending, AND not really looking at profit as a measure of anything? K.

    • 827a 1 hour ago
      Alternatively, the elephant in the room I'm surprised no one wants to talk about: the vibe coding is catching up with them.
      • xmprt 1 hour ago
        I don't think anyone is talking about it because it's not a very productive conversation to have. I'm not particularly bullish on vibe coding either but if you could explain what exactly about vibe coding causes these specific issues then it could be more interesting to discuss.

        But as it stands, the more likely reason is a capacity crunch caused by a chip shortage and demand heavily outpacing supply. Your vibe coding reason is based on as much vibes as their code probably is.

        • leptons 21 minutes ago
          Vibe coding does not usually produce performant code; it produces spaghetti, with the goal of making the user who asked for the work go away as soon as possible with an (often barely) working solution.

          I recently vibe-translated a simple project from Javascript to C, where Javascript was producing 30fps, and the first C version produced 1 frame every 20 seconds. After some time trying to get the AI to optimize it, I arrived at 1fps from the C project. Not a win, but the AI did produce working C code.

          I have no doubt that if I had done this myself (which I will do soon), with the appropriate level of care, it would have been 30fps or more.

      • muyuu 1 hour ago
        that is a separate issue indeed, but their comms make it rather obvious they are scrambling to reduce compute and are just slashing their service selectively, with openclaw and Max users being first on the chopping block
      • throwaway27448 1 hour ago
        It should catch up faster. It's absolutely useless for the bulk of the tedium—notably, soldering together random repos to satisfy executives—that makes up my job now.
        • stemlord 33 minutes ago
          That's mostly been my job since I started around 2010 haha
      • eatsyourtacos 1 hour ago
        That's not an elephant in the room. It's just proof of how insanely useful the tool is, and of the reality that so much more hardware is needed. To the people asking "why are these companies building insanely large data centers"... this is why!
        • kartoffelsaft 1 hour ago
          The problem is that vibe-coding, when it fails (i.e. it's non-useful, at least for a bit), is usually solved by more vibes. Try again and hope it works. Ask it to refactor and hope the cleaner code helps it along. If you're willing to think about the code yourself you'll likely ask it questions about the codebase. High vibe-code usage is both a metric that it is working and that it's failing.
        • georgeecollins 1 hour ago
          I think it is telling that this audience down votes this. It's kind of obvious that the thing is being used a lot. Doesn't mean it works as well as advertised. Doesn't mean the business model they have works. Just means there is a lot of demand. You can't ignore that.
        • otikik 1 hour ago
          That is only true if there's a pricepoint that vibecoders are willing to pay per token that allows Anthropic to make a profit.
        • SpicyLemonZest 1 hour ago
          I have no particular insight into the Anthropic backend, but it's possible in general for systems to have architectural issues which cannot be mitigated by just adding more hardware.
        • skeeter2020 1 hour ago
          maybe you should study up on correlation and causation before you declare "proof"; it's also possible that it goes the other way.
          • eatsyourtacos 1 hour ago
            The proof is already there. It's concrete. I've seen it directly the last few months of using claude code. It closed the loop. It's insanely beneficial when used properly- that is a pure fact. You act like it's an opinion.
    • vitosartori 1 hour ago
      I was vacationing! What's up with OpenAI now? Asking with some morbid curiosity tbh.
      • feature20260213 1 hour ago
        Nothing, Effective Altruist dweebs realizing that the world isn't their psychology experiment.
      • dyauspitr 1 hour ago
        As a person that hasn’t used Claude code before, I’ve been using OpenAI’s Codex and it is pretty amazing. I wonder how much more amazing Claude is.
        • Someone1234 1 hour ago
          Both are great; where they differ is that Claude Code has better instincts than Codex, meaning it will naturally produce things the way you, the developer, would have.

          Codex shines really well at what I call "hard problems." You set thinking high, and you just let it throw raw power at the problem. Whereas, Claude Code is better at your average day-to-day "write me code" tasks.

          So the difference is kind of nuanced. You kind of need to use both for a while to get a real sense of it.

          • mchusma 1 hour ago
            I think the way I and others use it is: code with Claude, review or bug-hunt with Codex. Then I pass the review back to Claude for implementation. Works well. Better than Codex's implementation, and it finds gaps versus using Claude to review itself, in my opinion.
      • binarymax 1 hour ago
        Codex switched to paid API tokens only. Not to mention their alignment with the department of war.
        • winterqt 1 hour ago
          > Codex switched to paid API tokens only.

          They’re still doing subscriptions: https://developers.openai.com/codex/pricing

          • bachmeier 1 hour ago
            I'm happy I invested in setting up Codex CLI and getting it to work with ollama. For the toughest jobs I can use Github Copilot (free as an academic) or Gemini CLI. If we see the per token price increase 5x or 10x as these companies move to focusing on revenue, local models will be the way to go, so long as stuff like Gemma 4 keeps getting released.
        • andai 1 hour ago
          Can you give context for the API thing?

          Edit: Looks like it still works with subs, they just measure usage per token instead of per message.

      • Someone1234 1 hour ago
        Codex just changed the way they calculate usage with a massive negative impact.

        Before, a subscription was the cheapest way to get Codex usage, but now they've essentially made API and subscription pricing match (e.g. a $200 sub = $200 in API Codex usage).

        The only value of a subscription now is that you get the web version of ChatGPT "free." In terms of raw Codex usage, you could just as easily buy API usage.

        • embedding-shape 1 hour ago
          > e.g. $200 sub = $200 in API Codex usage [...] In terms of raw Codex usage, you could just as easily buy API usage.

          I don't think it works out like that. I'm on the ChatGPT Pro plan for personal usage, and for a client I'm using the OpenAI API, both almost exclusively with GPT 5.4 xhigh. After a week of pretty much 50/50 work on client/personal projects, the client's API usage is up to 400 USD, while the ChatGPT Pro limit has 61% left and resets tomorrow.

          Still seems to me you'd get a heck more out of the subscription than API credits.

          • Archit3ch 45 minutes ago
            This. ChatGPT Pro personal at $20/month and using GPT 5.4 xhigh is the best deal currently. I don't know if they are actually losing money or betting on people staying well under limits. Clearly they charge extra to businesses on the API plans to make up for it.

            In the future, open models and cheaper inference could cover the loss-leading strategies we see today.

          • Someone1234 44 minutes ago
            Right, because you're on the old structure, not the new one.

            They just rolled it out for new subscribers and existing ones will be getting it in the "coming weeks." Enterprise already got hit with this from my understanding.

          • nickthegreek 58 minutes ago
            The ChatGPT Personal Pro plan hasn't had the change yet. It is rolling out to Enterprise users first.
        • postalcoder 1 hour ago
          This is not true. The change applies to the credits, i.e. the incremental usage that exceeds your subscription limits.
          • Someone1234 40 minutes ago
            OpenAI's own help page suggests otherwise.
    • brenoRibeiro706 2 hours ago
      I agree; I think that's what happened. But it's a shame. I'm having a lot of trouble with poor-quality results from Claude Code, and the session limit gets used up quickly.
    • garganzol 2 hours ago
      Makes sense; even the plan name seems to agree: "Claude Max".
      • bradgessler 2 hours ago
        Reminds me of an “all you can eat” buffet I was at once where the owner told me, “that’s it, that’s all you can eat” and cut it off.
        • 9991 2 hours ago
          Did this prompt some reflection on your part?
          • ceejayoz 2 hours ago
            What, that businesses lie?
            • mc32 1 hour ago
              It’s like most lifetime warranties. They’re not what they seem to mean colloquially. They have a contractual meaning.
              • skeeter2020 1 hour ago
                Or my grocery store, which marks certain products with signs saying "Always $n", even though over the past 5+ years n has increased regularly and dramatically.
              • scottyah 1 hour ago
                They never said WHOSE lifetime...
                • mc32 1 hour ago
                  I think with some it's the lifetime of the product line. If it's sunset and no longer sold, the lifetime ends when support ends.
              • ceejayoz 29 minutes ago
                That we accept the lying doesn't mean it's good.
        • michaelbarton 1 hour ago
          Sounds like the most blatant case of false advertising since the movie The Neverending Story
    • jimkleiber 2 hours ago
      I looked at my cc usage and I was at 90% of my weekly allowance after 3 days of use...BUT, if I looked at the usage stats with the chart, it showed, on a scale of 1-4 intensity of usage (4 being most intense), the three days as such:

      Day 1: 2

      Day 2: 3

      Day 3: 1

      Not sure how I can hit such limits so quickly with such low scores on its own chart.

      • skywhopper 1 hour ago
        The limits are smaller now, is how.
        • jimkleiber 52 minutes ago
          Then why not update their chart to at least say that? The numbers (shading, actually) on the chart aren't absolute, they're relative, so at least make it look as if I spent a lot of time on it. If they're gonna change their limits without being clear about it, go all the way. Right now, I can go, "See, you're actually saying I didn't use that much compared to the limits."
        • cube00 44 minutes ago
          Which is fine, but the way they're tightening the screws, and not saying anything until they announce the results of their A/B tests, is very frustrating.
    • politelemon 2 hours ago
      What is openai's involvement here, as I am out of the loop.
      • ezfe 2 hours ago
        Claude: Autonomous weapons and domestic surveillance are our red line

        Pentagon: No

        OpenAI: We are okay if the line is merely a suggestion and we encourage you not to cross it!

        Pentagon: Yes we pick that option

      • masklinn 2 hours ago
        I assume it's anthropic rejecting the US Government's use of their software for domestic mass surveillance or fully autonomous weapons, and openai happily agreeing to it.

        That has led to a significant number of people switching over from openai, or at least stating they were going to do so.

      • Analemma_ 2 hours ago
        They made a $25 million donation to Trump, which was repaid in kind by designating Anthropic a supply chain risk. Unfortunately, they weren’t nearly subtle enough about this, and went “sure, we’ll take over the contract with no limits on killbots or domestic surveillance, no problem!” on the same day as Anthropic got in trouble, and people put two and two together.
    • BlueRock-Jake 1 hour ago
      On the nose. Dealt with this last week. Ran maybe 5 queries (not even in code) and was maxed out for the day. What a great way to spend my money
    • ramon156 2 hours ago
      Distillation is how they're planning to make money? What a poor strategy. This is next-level FOMO (Fear Of Not Being The #1 LLM Provider).

      I cancelled my subscription last week; I'll see them when they fix this nonsense.

    • guelo 1 hour ago
      The responsible thing would be to not sell way more subscriptions than their capacity. But they have to show the exponential revenue curves to their investors. I cancelled my subscription yesterday.
  • ajb92 2 hours ago
    The trend on the status page[1] does not inspire confidence. Beginning to wonder if this might be a daily thing.

    [1] https://status.claude.com/

    • aurareturn 2 hours ago
      They went from $9b ARR at the end of 2025 to $30b ARR today. That's more than 3x the size in 3 months. I expect growing pains.

      For some context, they added 2x Palantir or .75x Shopify or .68x Adobe annual revenue in March alone.

      • rpozarickij 53 minutes ago
        It's also worth keeping in mind that Anthropic's compute needs are nothing like those of a company like Shopify or Adobe, so revenue alone might not accurately paint the picture of what they're dealing with right now.
      • twelvechairs 2 hours ago
        Yeah, it's a huge demand upswing from the growth of openclaw and similar agents straining resources. Very clear from recent changes and announcements around this [0].

        Fwiw there are worse delays from second-tier providers like Moonshot's Kimi K2.5, which are also popular for agentic use.

        [0] https://news.ycombinator.com/item?id=47633396

      • samlinnfer 2 hours ago
        And they are early adopters of the vibe coding paradigm, having a 100% Claude generated codebase.
        • aurareturn 2 hours ago
          I assume most of their outages are related to this insane scaling and the lack of available compute.

          Vibe coding doesn't automatically mean lower quality. My codebase quality and overall app experience has improved since I started using agents to code. You can leverage AI to test as well as write new code.

          • CharlieDigital 2 hours ago

                > I assume most of their outages is related to this insane scaling and lack of available compute.
                > 
                > Vibe coding doesn't automatically mean lower quality
            
            Scalability is a product of smart, practical architectural decisions. It doesn't happen for free and isn't emergent (the exact opposite is true) unless it is explicitly designed for. The problem is that ceding more of the decision-making to the agent means there's less intentionality in the design, which is likely a contributor to scaling pains.
            • aurareturn 2 hours ago
              My theory is that most of their outages are compute and scale related. I.e., a few GPU racks blow out and some customers see errors. They don't have any redundant compute as backup because supply is constrained right now. They're willing to lower reliability to maximize revenue.
            • bpodgursky 1 hour ago
              This is only true for small companies that can infinitely scale within AWS without anyone noticing.

              You are talking about software scaling patterns, Anthropic is running into hardware limitations because they are maxing out entire datacenters. That's not an architectural decision it's a financial gamble to front-run tens of billions in capacity ahead of demand.

            • colordrops 2 hours ago
              Why would you think that the person you are replying to didn't design in scalability? What exactly are emergent features when vibe coding? If scalability is an explicit requirement it can be done.
              • CharlieDigital 1 hour ago

                    >  What exactly are emergent features when vibe coding?
                
                Regression to the mean. See the other HN thread[0]

                The LLM has no concept of "taste" on its own.

                Scalability, in particular, is a problem that goes beyond the code itself and also includes decisions that happen outside of the codebase. Infrastructure and "platform" in particular has a big impact on how to scale an application and dataset.

                [0] https://dornsife.usc.edu/news/stories/ai-may-be-making-us-th...

          • _fat_santa 2 hours ago
            After the CC leak last week I took a look at their codebase and my biggest criticism is they seem to never do refactoring passes.

            Personally I write something like 80-90% of my code with agents now but after they finish up, it's critical that you spin up another agent to clean up the code that the first one wrote.

            Looking at their code it's clear they do not do this (or do this enough). Like the main file being something like 4000 LOC with 10 different functions all jammed in the same file. And this sort of pattern is all over the place in the code.

            • sheepscreek 1 hour ago
              I have a buddy who used to work at Shopify and was proud of having sprints dedicated to removing unused features. This is really underrated, but it's the only reliable way to prevent bloat. Oh, and getting rid of bloat is way more satisfying!
            • PopePompus 2 hours ago
              How do you do the cleanup? Just /simplify or something you rolled yourself?
            • jvuygbbkuurx 1 hour ago
              If that works, it's already built into the system
          • RC_ITR 2 hours ago
            Isn't the whole selling point of AI agents that you now can do things like scale 3x without scaling your team accordingly?
            • monooso 2 hours ago
              I haven't seen anyone claim that applies to infrastructure or compute.
              • dpark 1 hour ago
                Since apparently LLMs have also conquered physics, “Claude, transmute this lead to gold for me.”
                • RC_ITR 1 hour ago
                  Yeah, it's almost like the point I was making is that everyone is overselling AI agents' capabilities.
                  • dpark 57 minutes ago
                    I’m sure someone is out there claiming that AI is going to solve all your business’s problems no matter what they are. Remotely sane people are saying it will solve (or drastically improve) certain classes of problems. 3x code? Sure. 3x the physical hardware in a data center? Surely not.
              • RC_ITR 1 hour ago
                Implying that software is somehow divorced from infrastructure/compute efficiency and utilization isn't a claim I've seen many make either.
            • aurareturn 2 hours ago
              I assume so. They're doing it with around 99% uptime.
              • daveguy 51 minutes ago
                Or, to put it another way, almost 2 9s.
          • samlinnfer 2 hours ago
            Well if we use Claude Code's code quality as a benchmark ...
      • nonameiguess 1 hour ago
        To be clear, this number will probably end up being reasonably accurate, but it is a pet peeve nonetheless how shitty these financial metrics have become in the startup world. We're three months past the end of 2025. You'd think we'd want to see at least two years with $30 billion in revenue earned in each before we say, with any meaningful level of statistical validity, that they have $30 billion in "annual recurring" revenue.
    • sh1mmer 2 hours ago
      They might need to do some vibe refactoring.
      • ryandrake 2 hours ago
        2026 may be the year that many companies relearn: there is no problem that can’t be made worse by adding even more code.
      • giwook 2 hours ago
        And then some vibe code reviewing.
      • fb03 2 hours ago
        Outages are already happening, besides that, we need vibe warrooming
    • cube00 39 minutes ago
      They've also stopped reporting on the causes; it's just "it's resolved" and they move on.
    • skippyboxedhero 1 hour ago
      It has been a daily thing for 2-3 months.
  • kristjansson 1 hour ago
    No one is going to like this answer, but there’s a simple solution: pay for API tokens and adjust your use of CC so that the actions you have it take are worth the cost of the tokens.

    It’s great to buy dollars for a penny, but the guy selling em is going to want to charge a dollar eventually…

    • Goronmon 1 hour ago
      > ...pay for API tokens and adjust your use of CC so that the actions you have it take are worth the cost of the tokens

      Do you feel there is enough visibility and stability around the "Prompt -> API token usage" connection to make a reliable estimate as to what using the API may end up costing?

      Personally, it feels like paying for Netflix based on "data usage" without having any way to know ahead of time how much data any given episode or movie will end up using, because Netflix is constantly changing the quality/compression/etc. on the fly.

      • kristjansson 1 hour ago
        Time is a relatively good proxy for spend. There are also ex post diagnostics, like the token count and cost it can write to the status line.

        I agree that ex ante it’s tough, and they could benefit from some mode of estimation.

        Perhaps we can give tasks sizes, like T-shirts? Or a group of Claudes can spend the first 1M tokens assigning point values to the prospective tasks?
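
        For the ex post side, the arithmetic is simple enough to script against whatever token counts the status line reports. A minimal sketch, with hypothetical placeholder per-million-token rates (not Anthropic's actual prices; substitute the provider's published pricing):

```python
# Back-of-envelope session cost from token counts. The rates below are
# hypothetical placeholders (USD per million tokens), not real pricing.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a session, given its token counts."""
    return (input_tokens / 1e6) * PRICE_PER_MTOK["input"] \
         + (output_tokens / 1e6) * PRICE_PER_MTOK["output"]

if __name__ == "__main__":
    # A session that read 800k tokens of context and wrote 50k tokens:
    print(f"${estimate_cost(800_000, 50_000):.2f}")  # prints $3.15
```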

        • Goronmon 1 hour ago
          Even time doesn't feel like it would provide consistent information.

          Take the response on another post about Claude Code.

          https://news.ycombinator.com/item?id=47664442

          This reads like even if you had a rough idea today about what usage might look like, a change deployed tomorrow could have a major impact on usage. And you wouldn't know it until after you were already using it.

    • Nifty3929 1 hour ago
      This is it. These subscriptions have been heavily subsidized, which was fine when usage was much lower overall. But with so many folks trying to use the tools and soaking up all the chips something has to give.

      Now we’re going to find out what these tools are really worth.

    • gonzalohm 1 hour ago
      It's not a subsidy. It's predatory pricing, and it should be illegal: I offer you a service at a loss to remove competition, then increase prices once you are stuck with it.
      • ronsor 1 hour ago
        Actually, that is illegal.
        • daveguy 1 hour ago
          Now we just have to vote for the DOJ that will enforce it. Or at least not just roll over for donations to their crypto scams.
      • bschwarz 1 hour ago
        That's the VC playbook.
    • varispeed 19 minutes ago
      The problem with tokens is that they create the wrong incentive: the quicker the model arrives at the solution, the fewer tokens you have to buy.

      So I've noticed the model purposefully coming up with dumb ideas or running around in circles, and only when you tell it that it's trying to defraud you does it suddenly come back with the right solution.

    • jimkleiber 1 hour ago
      I just want a little predictable insight into how much I get. For example, at a buffet, I know I can only eat so much food and can plan around it. This is like going to a buffet and not knowing how many plates I can take or how big the plates are, and it changes each week, and yet I have to keep paying the same price. Except it's not about eating, it's about my work and deadlines and promises and all that.
      • criddell 1 hour ago
        When you hire a person, you don't know what you are going to get out of them today.

        If an hour of an excellent developer's time is worth $X, isn't that the upper bound of what the AI companies can charge? If hiring a person is better value than paying for an AI, then do that.

        • jimkleiber 1 hour ago
          Fair on not knowing what you'll get out of someone. But if that varies wildly, I may not want to hire that person. Even with employment, predictability matters a lot. If they underperform too much, I might feel annoyed. If they overperform, I might feel guilty.

          They can charge whatever they want, but I think many people like to make business decisions based on relative predictability, or at least want to be aware that there's a risk. If they framed it as "some weeks you have lots of usage, some weeks less, and it depends on X factors, or even random factors," then people could make a more informed choice. Right now it's basically incredibly vague, which works while usage is relatively predictable and starts to fail when it's not, for those who wanted the implied predictability.

      • _flux 1 hour ago
        That's what these providers want as well, but from the other side. They want to know that a customer won't be able to eat more than certain number of servings, as they need to pay for each of those servings.

        It works out even if some customers are able to eat a lot, because people on average have a certain limit. The limits of computers are much higher.

        • jimkleiber 1 hour ago
          Fair, and I think openclaw and all the orchestrators are having agents maxing out the plans. So maybe they figure out a new tier that is agent-run vs human-run. Agents are much more insatiable, whereas humans have a limit. Not sure if it'd be possible to split between those two different modes, but I think that might address the appetite issue better.
      • kristjansson 1 hour ago
        If you need the tokens for real work, that’s what the API and the other providers like Bedrock are for. The subscription product is merely to whet your appetite.
        • jimkleiber 55 minutes ago
          Well then I would just not use their service. I used extra usage once and, for what I'd consider a low amount of tests and coding, racked up like $300 in an hour or so. For some, that's not a lot of money; for me, I'd just code it manually, especially since there's almost no way to gauge how much I'll need or how fast it goes.

          I'm not sure how businesses budget for llm APIs, as they seem wildly unpredictable to me and super expensive, but maybe I'm missing something about it.

        • gowld 1 hour ago
          Missing the point. I don't choose which tokens to buy. I send a request and the server decides how much it costs after it's done.
  • laacz 7 minutes ago
    I'm more surprised at how heavily OpenAI is subsidising their ChatGPT subscriptions. With Plus you can do a lot more than with Claude's 5x Max. Is it an expense they can just afford while people haven't migrated over from CC?
  • SkyPuncher 1 hour ago
    My biggest frustration right now is the seeming complete loss of background agent functionality. Permissions seem completely botched for background agents right now. When that happens, the foreground agent just takes over the task despite:

    1. Me not wanting that for context management reasons

    2. It burning tokens on an expensive model.

    Literally a conversation that I just had:

    * ME: "Have sonnet background agent do X"

    * Opus: "Agent failed, I'll do it myself"

    * Me: "No, have a background agent do it"

    * Opus: Proceeds to do it in the foreground

    * Flips keyboard

    This has completely broken my workflows. I'm stuck waiting for Opus to monitor a basic task and destroy my context.

  • nathell 1 hour ago
    HN’s guidelines say ‘Don’t editorialize’. The original title here is ‘[BUG] Claude Code login fails with OAuth timeout on Windows’, which is more specific and less clickbaity.
  • giancarlostoro 2 hours ago
    Looks to be sourced from an outage:

    https://status.claude.com/

  • JohnMakin 53 minutes ago
    The commenters here don't seem to realize this was posted during the outage yesterday that affected login for most claude code users.
  • DiffTheEnder 2 hours ago
    I'm finding queries are taking about 3x as long as they used to regardless of whether I use Sonnet or Opus (Claude Code on Max)
  • butz 26 minutes ago
    Run LLMs locally. Otherwise suffer service disruptions and very likely price hikes in the future.
  • ivanjermakov 1 hour ago
    Wonder what the next AI winter trigger would be. Coding agent client collapsing under its own tech debt?
    • bachmeier 59 minutes ago
      I think it's been clear from the beginning that the per-token price of usage was far below what it will be when firms have implemented their profit-maximizing price plans. "AI winter" will happen when these firms start maximizing profit. At that point it'll be too expensive for all but certain use cases to use the best technology for work.

      We'll see AI chat replace Google, we'll see companies adopting AI in high-value areas, and we'll see local models like Gemma 4 get used heavily.

      AI winter will see a disappearance of the clickbait headlines about everyone losing their jobs. Literally nobody is making those statements taking into account that pricing to this point is way less than the profit maximizing level.
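      The gap can be sketched with back-of-envelope arithmetic (every price and token count below is a hypothetical assumption, not any provider's actual rate):

```python
# Back-of-envelope sketch of per-token session economics.
# All prices and token counts are illustrative assumptions,
# not any provider's actual rates or costs.

def session_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Dollar cost of one agent session at per-million-token prices."""
    return (input_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1e6

# A long agentic coding session can consume millions of tokens.
tokens_in, tokens_out = 5_000_000, 500_000

# Hypothetical subsidized pricing vs. hypothetical profit-maximizing pricing.
subsidized = session_cost(tokens_in, tokens_out, 3.0, 15.0)    # $22.50
profit_max = session_cost(tokens_in, tokens_out, 12.0, 60.0)   # $90.00

print(f"subsidized session: ${subsidized:.2f}")
print(f"profit-maximizing session: ${profit_max:.2f}")
```

      At a several-fold price increase, a session that looked cheap on a subsidized plan becomes a line item worth budgeting for, which is exactly the point about use cases getting priced out.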

  • fabbbbb 1 hour ago
    Is this really relevant news? Please share more bug reports from popular services and tools. Feels a tiny bit biased. My CC has been just fine for at least three weeks.
  • websap 1 hour ago
    Isn't it a little weird that we trust this app to help us build some of the most important parts of our business, and the company that vends this app keeps breaking it in unique ways?

    At my workplace we have been sticking with older versions, and now stick to the stable release channel.

    • scottyah 1 hour ago
      I like dogfooding. You can use Azure if you want infra that is clearly not being used, tested, and pushed to the limits by its own creators.
  • m3kw9 10 minutes ago
    How are they making billions with reliability like that?
  • alasano 1 hour ago
    If you prepare yourself a token with "claude setup-token" (presuming you're not already locked out and had one) you can run "CLAUDE_CODE_OAUTH_TOKEN=sk.. claude" to use your account.
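    As a sketch of that flow (assuming a current Claude Code CLI; the token value is a placeholder for whatever `setup-token` prints):

```shell
# Generate a long-lived OAuth token while you still have working access
claude setup-token

# Later, start Claude Code with the saved token, skipping the browser login flow
# (replace the placeholder with the token printed by setup-token)
CLAUDE_CODE_OAUTH_TOKEN=sk-ant-oat01-... claude
```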
  • CapmCrackaWaka 2 hours ago
    If Anthropic's reliability becomes a meme, they risk brand death like Microsoft. Got to hand it to them though, they're really living that “AI writes all of our code and it should write your code too” life.
    • fleischhauf 21 minutes ago
      I'm quite impressed by how far they've got given what the Claude Code code looks like.
    • smt88 1 hour ago
      If Microsoft is your example of "brand death," Anthropic is dreaming of that kind of wild success and shouldn't care about its brand at all
    • love2read 2 hours ago
      > they risk brand death like Microsoft

      Is Microsoft (one of the largest companies in the world) really a victim of brand death?

      • mplewis 1 hour ago
        have you ever met a person who likes outlook?
        • whobre 1 hour ago
          Anyone who’s ever tried Lotus Notes.
        • guzfip 1 hour ago
          No but I know oh so many forced to use it regardless.
  • totalmarkdown 1 hour ago
    I upgraded to the 20x plan and hit the weekly limit within 24 hours. I was running some fairly large tasks, but was still surprised it hit the weekly session limit so quickly. Now I can't use it for 6 more days :( I didn't even have time to ask it to help set up logs or something to track my usage before I hit the session limit.
    • abroszka33 1 hour ago
      How are you using it to reach the limit so quickly? I'm at 13% on a 10x plan and I have used it for hours every day for the last 5 days. I never hit a limit.
      • skerit 1 hour ago
        I have the 20x plan and use it together with my husband. 4 days into our weekly usage window and we're only at 54% (and we both use it the entire day).

        I have no idea how people are hitting the limits so fast.

      • cvdub 1 hour ago
        Hitting limits is more related to how many tokens it’s generating, not necessarily how complex the changes are.

        Hit the weekly limit on my 20x plan last week trying to do a full front end rewrite of a giant enterprise web app, 600+ html templates, plus validating every single one with playwright.

        • abroszka33 13 minutes ago
          That's definitely a lot of work. That consuming a week worth of tokens sounds reasonable.
      • vanchor3 50 minutes ago
        It seems like Cowork can easily chew through a few percent at a time, more if it gets lost in the weeds.
  • jostmey 1 hour ago
    15000 milliseconds! Makes me laugh. I've had the same issue! Usually happens in the morning
  • whicks 2 hours ago
    IME this isn't just a 'Claude Code' problem, I'm seeing extremely degraded / unresponsive performance using Opus 4.6 in Cursor.
    • smt88 1 hour ago
      The status page indicates issues on almost all services
  • varispeed 21 minutes ago
    I found that telling Claude it is trying to defraud you and make you spend money often gets it back on track; it returns to previous performance briefly, until it again starts doing nonsense.

    I think Anthropic's model has a conflict of interest. They seem to have nerfed the models so that it takes more iterations to get the result (and spend more money) than it used to, where e.g. Opus would get something right the first time.

  • tomasphan 2 hours ago
    98% uptime is not great. Our eng department is thinking about going half-and-half with Codex, but of course there's a switching cost.
    • prabal97 21 minutes ago
      FYI I use my Codex models with Claude Code and they work pretty great. It can even pick up on existing conversations w/ Opus and then resume w/ OAI models.
    • tornikeo 1 hour ago
      I'm VERY curious about your case. What kind of switching costs do you guys have? I'm working at a very young startup that is still not locked into either AI provider harnesses -- what causes switching costs, just the subscription leftovers or something else?
      • p_stuart82 44 minutes ago
        Subscription leftovers are noise. The real switching cost is the harness glue.

        Prompts. Tool-calling quirks. Evals. Auth. Retries. All the weird failure modes your team already paid to learn.

  • HoldOnAMinute 2 hours ago
    I solved this by upgrading Claude Code, closing down all instances, closing my browser, starting claude again, and doing a /login
    • stronglikedan 2 hours ago
      I solved this by upgrading Claude Code, closing down all instances, closing my browser, and starting Codex
    • reluctant_dev 2 hours ago
      This resolved it for me as well but not sure if this was just a timing thing.
    • csomar 2 hours ago
      Yes, an upgraded Claude Code instance telepathically improves Claude's back-end servers.
      • giwook 2 hours ago
        LOL telepathy!

        It's actually via quantum entanglement.

  • jollymonATX 1 hour ago
    Simply put, Anthropic does not have enough compute.
  • baq 2 hours ago
    Not sure how Claude and CC have become the de facto best model given that GPT 5.3 Codex and 5.4 exist. This space moves so fast that you should be testing your workflows on different models at least once every quarter, prudently once a month.
    • Quothling 2 hours ago
      We've got access to Opus 4-6, GPT 5.4, Gemini Pro and a few others through corporate. I have customized agents on Claude, GPT and Gemini, since we tend to run out of tokens for x model by the end of a month. Out of all of them I've consistently been using Sonnet for most tasks; Opus functions mainly as a hand-off agent and reviewer. In my anecdotal experience Claude is miles ahead of the other models, and has been for a long while, when it comes to writing code the way we want it. Which is explicit, no-abstraction, no-external-packages, fail-fast defensive programming. I imagine you'd get different mileage with different models and different coding styles.

      The rest of the organisation, which is not software development or IT related, mainly uses GPT models. I just wish I hadn't taught risk management about Claude Code so they weren't wasting MY tokens.

    • fakwandi_priv 1 hour ago
      I've been an avid fan of Codex for the last few months, but I finally hit the weekly limit, so I wanted to try out Claude Code before biting the bullet and going for the 200 dollar Codex sub.

      Obviously in hindsight it would be unfair to Anthropic to judge them on an unstable day, so I'll leave those complaints aside, but I hit the session limit way too fast. I planned out 3 tasks and it couldn't finish the first plan completely; for that implementation task it had seen a grand total of 1 build log and hadn't even run any tests, which already caused it to enter the red territory of the context circle.

      It was even asking me during planning which endpoints the new feature should use to hook into the existing system. Codex would never ask this; it would simply look these up during planning, and whenever it encountered ambiguity it would either ask straight away or note it as an open question. I have to wonder if they're limiting this behavior to keep the context as small as possible and prevent even earlier session limits.

      Maybe Codex's limits are not sustainable in the long run and I'm very spoiled by them, but at this point CC (Sonnet) and Codex (5.4) are simply not in the same league when comparing both 20 dollar subscriptions.

      I will also clearly state that the value both these tools provide at these price points is absolutely worth it; it's just that Codex's value/money ratio is much better.

    • m-schuetz 2 hours ago
      Checking different models once every quarter is exactly what made me move to claude code.
      • skippyboxedhero 1 hour ago
        Anthropic models haven't been far ahead for a while. Quite a few months at least. Chinese models are roughly equal at 1/6th the cost. Minimax is roughly equal to Opus. Chinese providers also haven't had the issues with uptime and variable model quality. The gap with OpenAI also isn't huge and GLM is a noticeably more compliant model (unsurprisingly given the hubristic internal culture at Anthropic around safety).

        CC is a better implementation and seems to be fairly economic with token usage. That is really the only defining point and, I suspect, Anthropic is going to have a lot of trouble staying relevant with all the product issues.

        They were far ahead for a brief period in November/December which is driving the hype cycle that now appears to be collapsing the company.

        You have to test at least every month, things are moving quickly. Stepfun is releasing soon and seems to have an Opus-level model with more efficient architecture.

        • nwienert 1 hour ago
          Minimax is nowhere near Opus in my tests, though for me at least, oddly, 4.6 felt worse than 4.5. I haven't used Minimax extensively, but I have an API-driven test suite for a product, and even Sonnet 4.6 outperforms it in my testing, unless something changed in the last month.

          One example is I have a multi-stage distillation/knowledge extraction script for taking a Discord channel and answering questions. I have a hardcoded 5k message test set where I set up 20 questions myself based on analyzing it.

          In my harness Minimax wasn't even getting half of them right, whereas Sonnet was 100%. Granted this isn't code, but my usage on pi felt about the same.

        • epistasis 1 hour ago
          > CC is a better implementation and seems to be fairly economic with token usage. That is the really the only defining point and, I suspect, Anthropic are going to have a lot of trouble staying relevant with all the product issues.

          What are you using to drive the Chinese models in order to evaluate this? OpenCode?

          Some of Claude Code's features, like remote sessions, are far more important than the underlying model for my productivity.

        • SkyPuncher 1 hour ago
          Claude is exceptionally better at long running agentic sessions.

          I keep coming back to it because I can run it as a manager for the smaller tasks.

  • world2vec 2 hours ago
    I'm getting "Prompt is too long" a lot today
  • postalcoder 1 hour ago
    I stopped using Claude Code several months ago and I can't say I've missed it.

    There was constant drama with CC. Degradation, low reliability, the harness conspiring against you, etc. – these things are not new. Its burgeoning popularity has only made it worse. Anthropic is always doing something to shoot themselves in the foot.

    The harness does cool things, don't get me wrong. But it comes with a ton of papercuts that don't belong in a professional product.

    • djmips 1 hour ago
      Back to artisan all natural intelligence coding?
  • LoganDark 43 minutes ago
    This was an outage.
  • mring33621 2 hours ago
    For a lot of my work, I'm pretty happy with OpenCode + GLM-4.7-Flash-REAP-23B-A3B-Q4_K_M.gguf running in llama.cpp.

    Free and local.
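    For anyone wanting to reproduce a setup like this, a minimal sketch (context size and port are illustrative; llama.cpp's bundled server exposes an OpenAI-compatible API):

```shell
# Serve a local GGUF model with llama.cpp's llama-server
# (model file is the one named above; adjust paths to your setup)
llama-server \
  -m GLM-4.7-Flash-REAP-23B-A3B-Q4_K_M.gguf \
  --ctx-size 32768 \
  --port 8080

# Then point OpenCode (or any OpenAI-compatible client) at
#   http://localhost:8080/v1
```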

  • dude250711 2 hours ago
    How is coding "solved" then?

    Unless they meant "all code that needs to be written has already been written" so their mission is to prevent any new code from being written via a kind of a bait and switch?

  • mikkupikku 1 hour ago
    I really don't understand the way Claude does rate limiting, particularly the 5 hour limit. I can get on at 11:30, blow through my limit doing some stupid shit like processing a pile of books into my llm-wiki, and then get notified that I've used 90% of my 5 hour session limit and I have to wait for noon (aka wait 10 minutes) for the five hour limit to reset. Baffling.
  • guzfip 1 hour ago
    Anyone played much with JetBrains' LLM agent?

    I’ve been toying around at home with it and I’ve been fine with its output mostly (in a Java project ofc), but I’ve run into a few consistent problems

    - The thing always trips up validating its work. It consistently tries to use powershell in a WSL environment I don’t have it installed in. It also seems to struggle with relative/absolute paths when running commands.

    - Pricing makes no sense to me; JetBrains' offering seems to have its own layer of abstraction in “credits” that just seems so opaque.

    Then again, I mostly use this stuff for implementing tedious utilities/features. I'm not doing entirely agent-written code and still do a lot of hand tweaks, because it's still faster to just do it myself sometimes. Mostly all from the IDE still.

  • nprateem 2 hours ago
    Antigravity has become near unusable too for the last week with Opus. Continual capacity alerts meaning tasks stop running.

    Not worth the money now, will be canceling unless fixed soon.

  • arduanika 2 hours ago
    The eternal return of https://xkcd.com/303/
  • nurettin 2 hours ago
    It started again.
  • rvz 1 hour ago
    Claude is now making itself unavailable after it was on vacation yesterday.

    Maybe you should consider....local models instead?

  • surcap526 1 minute ago
    [dead]
  • techpulselab 1 hour ago
    [dead]
  • maxothex 1 hour ago
    [dead]
  • phengze 1 hour ago
    [dead]
  • honeycrispy 2 hours ago
    The solution is clearly more vibe coding at anthropic.

    I doubt even the core engineers know how to begin debugging that spaghetti code.

    • Lionga 2 hours ago
      correct proompt is:"you are a senior engineer. fix issues. NO hallucinations this time. PRETTY PLEASE"
      • cube00 1 hour ago
        Needs more bold CRITICAL and some ultra-think
      • mring33621 2 hours ago
        You forgot the "No Mistakes!" clause
      • gedy 1 hour ago
        You missed: "Simon says:"
  • ai_slop_hater 2 hours ago
    Codex is pretty good, and it is written in Rust.
    • MeetingsBrowser 2 hours ago
      I’m a big fan of Rust, but the frontend being written in Rust doesn’t help a ton with backend issues unfortunately.
      • ai_slop_hater 2 hours ago
        Maybe, but you can literally feel the difference as you type. When you type in Codex, it's fast, it feels instant. When you type in Claude Code, it feels like playing a game in 60 fps after you already got used to 144 fps.
    • ramon156 2 hours ago
      Codex does not go well with my Zellij/Alacritty setup. It does not respect resize events. Opencode is nice, though
    • thefourthchime 2 hours ago
      5.4 is smarter than Opus when it comes to really figuring out a problem. Codex agentic stuff takes forever though.
    • isatty 2 hours ago
      Something being written in Rust has no bearing on whether it's good. You can create slop in any language.
      • 16bitvoid 1 hour ago
        At least the slop is fast and jitter-free, and not using React in a terminal.
    • jghn 2 hours ago
      > and it is written in Rust.

      So?