12 comments

  • PhilippGille 9 minutes ago
    How does it compare to Onyx (rebranded from Danswer, with more chat focus, while Danswer was more RAG focus on company docs/comms)?

    - https://onyx.app/

    - Their rebranded Onyx launch: https://news.ycombinator.com/item?id=46045987

    - Their orignal Danswer launch: https://news.ycombinator.com/item?id=36667374

  • acidburnNSA 53 minutes ago
    * "Self-hosted: Runs entirely on your infrastructure. No data leaves your network."

    * "Bring Your Own LLM: Anthropic, OpenAI, Gemini, or open-weight models via vLLM."

    With so many newbies wanting these kinds of services it might be worth adjusting the first bullet to say: "No data leaves your network, at least as long as you don't use any Anthropic, OpenAI, or Gemini models via the network of course"

    • prvnsmpth 32 minutes ago
      That's a good point, it might make sense to clarify that for individuals who want to self-host. I'll make the change, thanks!
    • cjonas 50 minutes ago
      Most organizations are going to be self hosting on aws, gcp or azure... So as long as you use their inference services as your LLM then you can keep it all within the private network
      • acidburnNSA 30 minutes ago
        Even self-hosting on AWS, GCP, or Azure isn't local enough for certain application, such as people doing export-controlled work where any sysadmin or person with physical access to the server/data is required to be a US Person (or equivalent in other countries). This is the niche that the govcloud solutions are aimed at serving. But some people just want to build big actually-private, actually self-hosted systems and do their own physical and network security.
      • prvnsmpth 40 minutes ago
        Exactly, enterprise customers almost always use private model endpoints on their cloud provider for any serious deployments. Data stays within the customer's VPC, data security and privacy is guaranteed by the cloud providers.
  • Doublon 3 hours ago
    Interesting!

    I also started to build something similar for us, as an PoC/alternative to Glean. I'm curious how you handle data isolation, where each user has access to just the messages in their own Slack channels, or Jira tickets from only workspaces they have access to? Managing user mapping was also super painful in AWS Q for Business.

    • prvnsmpth 3 hours ago
      Thank you!

      Currently permissions are handled in the app layer - it's simply a WHERE clause filter that restricts access to only those records that the user has read permissions for in the source. But I plan to upgrade this to use RLS in Postgres eventually.

      For Slack specifically, right now the connector only indexes public channels. For private channels, I'm still working on full permission inheritance - capturing all channel members, and giving them read permissions to messages indexed from that channel. It's a bit challenging because channel members can change over time, and you'll have to keep permissions updated in real-time.

  • jFriedensreich 14 minutes ago
    Can we please not change the meaning of chat to mean agent interface? It was painful to see crypto suddenly meaning token instead if cryptography. Plus i really dont want to “chat” with ai. its a textual interface
  • Lapalux 55 minutes ago
    Can it connect to Teams?
    • patates 43 minutes ago
      Tangeant: Why is integrating with teams SO difficult?

      I started parsing its system logs to create entries in our system automatically to book my times - just not todeal with their silly REST api requirements.

    • prvnsmpth 51 minutes ago
      Not yet, there’s a Microsoft connector implementation, but it only does Sharepoint, OneDrive, Outlook etc. and I haven’t tested it thoroughly yet. Teams required some special setup to work IIRC, so I skipped it. Will keep it on the roadmap though!
  • andai 57 minutes ago
    Nice! Could you elaborate on "not just a basic RAG"?
    • prvnsmpth 47 minutes ago
      Thank you!

      Typical RAG implementations I’ve seen take the user query and directly run it against the full-text search and embedding indexes. This produces sub-par results because the query embedding doesn’t really capture fully what the user is really looking for.

      A better solution is to send the user query to the LLM, and let it construct and run queries against the index via tool calling. Nothing too ground-breaking tbh, pretty much every AI search agent does this now. But it produces much better results.

  • swaminarayan 3 hours ago
    How well does the Postgres-only approach hold up as data grows — did you benchmark it against Elasticsearch or a dedicated vector DB?
    • prvnsmpth 3 hours ago
      I've done small scale experiments with up to 100-500k rows, and did not notice any significant degradation in search query latency - p95 still well under 1s.

      I haven't directly compared against Elasticsearch yet, but I plan to do that next and publish some numbers. There's a benchmark harness setup already: https://github.com/getomnico/omni/tree/master/benchmarks, but there's a couple issues with it right now that I need to address first before I do a large scale run (the ParadeDB index settings need some tuning).

    • cultofmetatron 1 hour ago
      we have a pretty intensively used postgres backed app handling thousands of users concurrently. After 6 years and thousands of paying custoners, we are only now approaching to the limits of what it can support on the horizon. TLDR: when you get there, you can hire some people to help you break things off as needed. if you're still trying to prove your business model and carve yoruself a segment of the market, just use postgres
      • prvnsmpth 1 hour ago
        Thanks for sharing! Big part of the reason why I decided on postgres, everything I've read about people using it in prod tells me that most organizations never really grow beyond requiring anything more than what it offers.
        • hobs 1 hour ago
          Most of the time just re-casting what you want in a horizontally shardable way is the "right" way to do it with any rdbms (if you scale) but at this point you can get boxes on AWS with 32TiB of ram, and most organizations don't have that much total data across their entire suite of stuff (many do, most don't.)
  • keyle 2 hours ago
    I've done some RAG using postgres and the vector db extension, look into it if you're doing that type of search; it's certainly simpler than bolting another solution for it.
    • prvnsmpth 2 hours ago
      Yeah, Omni uses Postgres and pgvector for search. ParadeDB is essentially just Postgres with the pgsearch extension that brings in Tantivy, a full-text search engine (like Apache Lucene).
  • vladdoster 3 hours ago
    Multiple pages link to a `API Reference` that returns a 404
    • prvnsmpth 2 hours ago
      Oops, sorry! That page is still a WIP, haven't pushed it yet. The plan was to expose the main search and chat APIs so that users can build integrations with third-party messaging apps (e.g. Slack), but haven't gotten around to properly documenting all the APIs yet.
  • octoclaw 3 hours ago
    [dead]
  • shablulman 3 hours ago
    [dead]