Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Alexander Borzunov | a26559ff65 | Fix `.generate(input_ids=...)` (#485) | 1 year ago |
| Alexander Borzunov | 329f7d31e8 | Add `blocked_servers` argument (#462) | 1 year ago |
| Alexander Borzunov | 056f22515a | Prioritize short inference, unmerge pools for long inference (#458) | 1 year ago |
| Alexander Borzunov | 8c546d988a | Test Llama, rebalancing, throughput eval, and all CLI scripts (#452) | 1 year ago |
| Alexander Borzunov | de930918a0 | Support loading blocks in 4-bit (QLoRA NF4 format, disabled by default) (#333) | 1 year ago |
| Alexander Borzunov | cb3f018f9f | Add LLaMA support (#323) | 1 year ago |
| Alexander Borzunov | 8f6342a861 | Refactor RemoteSequenceManager (#309) | 1 year ago |
| Alexander Borzunov | 21c3526ec1 | Start SequenceManager's thread only after first .make_sequence() (#301) | 1 year ago |
| Max Ryabinin | 793726b041 | Speed up loading blocks using init with meta weights (#285) | 1 year ago |
| Alexander Borzunov | fee19e9b9b | Use get_logger(__name__) instead of get_logger(__file__) (#265) | 1 year ago |
| justheuristic | ae9e71fe8e | Add local tensor-parallel fwd/bwd (#143) | 1 year ago |
| Alexander Borzunov | 668b736031 | Fix logging: do not duplicate lines, enable colors in Colab (#156) | 1 year ago |
| justheuristic | a2066a4096 | Optimize RemoteSequenceManager (#106) | 1 year ago |
| Alexander Borzunov | 43ac6016ac | Fix dtypes in backend schemas (#99) | 1 year ago |
| Alexander Borzunov | 7bd5916744 | Make Petals a pip-installable package (attempt 2) (#102) | 1 year ago |
| Dmitry Baranchuk | 6095f58681 | Deep distributed prompt tuning (#42) | 2 years ago |
| justheuristic | f0c7383181 | Implement RemoteSequential slicing and extra repr, add tests (#30) | 2 years ago |