https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#head
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://www.nanopub.org/nschema#hasAssertion
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://www.nanopub.org/nschema#hasProvenance
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#provenance
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://www.nanopub.org/nschema#hasPublicationInfo
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#pubinfo
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.nanopub.org/nschema#Nanopublication
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://purl.org/dc/terms/creator
https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
https://schema.org/Claim
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
https://schema.org/Observation
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
https://schema.org/Question
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://www.w3.org/2000/01/rdf-schema#comment
Scaling laws don't care about the scale of the "train" models?
Did anyone else get this?
When I predict a scaling law, the scale of the largest model matters, but the number of models used for fitting matters much, much more.
Initial results: scaling-law error by number of models, starting from the largest https://twitter.com/LChoshen/status/1803401845626511568/photo/1
Maybe more simply put:
You can predict a scaling law with 8 small models, and it would be better than with 3 large ones (which cost a lot)
Has anyone else seen this?
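One way to probe the question above is a small synthetic experiment. The sketch below is not the post author's method: it assumes a plain power law loss = a * size^(-b) with multiplicative log-normal noise, and every function name, model size, and parameter value is hypothetical. It fits the law in log-log space on a chosen set of "train" model sizes and measures how far the extrapolated loss lands from the noiseless value at a larger target size.

```python
import math
import random


def fit_power_law(sizes, losses):
    """Least-squares fit of log(loss) = log(a) - b * log(size); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in losses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict(a, b, size):
    """Loss predicted by the power law a * size^(-b)."""
    return a * size ** (-b)


def mean_relative_error(train_sizes, target_size, true_a=10.0, true_b=0.1,
                        noise=0.02, trials=200, seed=0):
    """Average relative extrapolation error at target_size over noisy refits.

    All defaults are illustrative assumptions, not values from the post.
    """
    rng = random.Random(seed)
    truth = predict(true_a, true_b, target_size)
    total = 0.0
    for _ in range(trials):
        # Simulate one noisy loss measurement per train model.
        losses = [predict(true_a, true_b, n) * math.exp(rng.gauss(0.0, noise))
                  for n in train_sizes]
        a, b = fit_power_law(train_sizes, losses)
        total += abs(predict(a, b, target_size) - truth) / truth
    return total / trials


if __name__ == "__main__":
    # Hypothetical parameter counts: eight small models vs. three large ones.
    small = [1e6 * 2 ** i for i in range(8)]
    large = [1e9, 2e9, 4e9]
    target = 1e11
    print("8 small models:", mean_relative_error(small, target))
    print("3 large models:", mean_relative_error(large, target))
```

Which set extrapolates better in this toy setup depends on the assumed noise level, the spacing of the sizes, and the distance to the target, so it illustrates the comparison and does not confirm the observation in the post.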
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
https://schema.org/keywords
AI
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
https://schema.org/keywords
cost
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
https://schema.org/keywords
initialresults
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
https://schema.org/keywords
models
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
https://schema.org/keywords
modelscale
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
https://schema.org/keywords
scalinglaws
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
https://schema.org/keywords
training
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#provenance
https://sense-nets.xyz/
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/ns/prov#SoftwareAgent
https://sense-nets.xyz/
http://www.w3.org/ns/prov#actedOnBehalfOf
https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#activity
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
https://sense-nets.xyz/supervisedActivity
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#activity
http://www.w3.org/ns/prov#wasAssociatedWith
https://sense-nets.xyz/
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://www.w3.org/ns/prov#linksTo
https://x.com/LChoshen/status/1803401845626511568
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://www.w3.org/ns/prov#wasAssociatedWith
https://x.com/LChoshen
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://www.w3.org/ns/prov#wasAttributedTo
https://orcid.org/0000-0002-0085-6496
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://www.w3.org/ns/prov#wasAttributedTo
https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion
http://www.w3.org/ns/prov#wasGeneratedBy
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#activity
https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts
http://xmlns.com/foaf/0.1/account
https://orcid.org/0000-0002-0085-6496
https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts
http://xmlns.com/foaf/0.1/account
https://x.com/LChoshen
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#pubinfo
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig
http://purl.org/nanopub/x/hasAlgorithm
RSA
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig
http://purl.org/nanopub/x/hasPublicKey
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArHtI92jm8pAYVsvJabxLGfOT+7G0JyJGh2gwjB5x2pFPga6wWTd+rNBWWUZViIFnaJrBEsJpgdnoupLU9ppwn+khMiGRfxqGsDDzwHcj3Jc75CRys7d3etwXdBdoXfBgjsJiZBazwm13idr6tljRrC1TaEJBnRQAqzBw9cLDeGY77cSznzXT39feUGT168dpCSE9O6u/48DvvWVqciHGsH9cQ+LroJJVsMrorwtsdZnAK+q48wtIP6pIpw5shSJ5LnA0qeN/f4TvTFDV6ItYIXjiWWpTECc/Bxmfnyat3B5xWCu9nvz8fEs7Ns0TuzQwT3/K55iSKDEIi/E0nO97xwIDAQAB
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig
http://purl.org/nanopub/x/hasSignature
kwO4GqIhQeFzJGBGWRP9n9T+melmnkrd/EaUCcLuZScYqLWjoRAdThFjYLDjPNDEUtX77Ddf46qBmfw4Ydm9ksvPfRIyKj78nGliWcWESn8zdbCyr6h/ldezO7psXGlWqi4FeyLKsvfBC3fjPZh24pteD1VKWOhL4X4gUYfE+W7aKklx5pM3WmXq0DQefbaQXHpyq3PeMFiUPbmC4O92iRO1k0izQ2KWkNSJXr1Q7q8nwcoran09uRPYam8NUwt+zU8t/NRS+bGRC702gLi32ZZFKN9Q9XcmwFHHazArKdSDqZQleq/aJhXvmtmlIkZWF+1geA3bknx6thSnwP99og==
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig
http://purl.org/nanopub/x/hasSignatureTarget
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig
http://purl.org/nanopub/x/singedBy
https://sense-nets.xyz/
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig
http://www.w3.org/ns/prov#wasAssociatedWith
https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16VtssigningDelegation
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://purl.org/dc/terms/created
2024-09-13T18:09:57.099Z
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://purl.org/dc/terms/creator
https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://purl.org/dc/terms/license
https://creativecommons.org/licenses/by/4.0/
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://purl.org/nanopub/x/hasNanopubType
https://sense-nets.xyz/SemanticPost
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://purl.org/nanopub/x/wasCreatedAt
https://sense-nets.xyz/
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://www.w3.org/2000/01/rdf-schema#label
CoSMO Semantic Post
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
http://www.w3.org/ns/prov#wasAttributedTo
https://orcid.org/0000-0002-0085-6496
https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE
https://sense-nets.xyz/hasRootSigner
0xf6ECcfD463afB464dcC85b051DF2E93E2646E6D2
https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts
http://xmlns.com/foaf/0.1/account
https://orcid.org/0000-0002-0085-6496
https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts
http://xmlns.com/foaf/0.1/name
Leshem Choshen 🤖🤗 @ICML wanna talk?