Post by @Jage


Jage

@Jage@mas.to

Direct from Kalamazoo; some business, some tech, some very random stuff.

mas.to

@Jage@mas.to · Mar 24, 2026

SWE-rebench Leaderboard is a newer coding benchmark whose dataset is continuously updated with recent tasks that LLMs presumably have not seen before. Claude Opus 4.6 ranks number 1, which is probably not surprising, but GPT 5.2 (not 5.4) and GLM 5 placing 2nd and 3rd may be.
https://swe-rebench.com/


