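Uploading and Downloading Datasets and Models on the Hugging Face Hub
Environment Setup
In the cloud-native era Go was king; in the AI era Python is. So the first step is to set up a Python environment and install the required library packages.
Python virtual environments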
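Python's package and dependency management has a long-standing weakness: when different projects depend on different versions of the same package, the versions conflict, dependencies become tangled, and one project's install can break another.
Python 3 therefore supports virtual environments: each environment is a separate directory with its own interpreter and packages, which isolates projects from one another and works around the weak dependency management.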
First-time setup:
# Install virtualenv
$ pip install virtualenv
# Create a virtual environment named .env in ~/
$ cd ~/
$ virtualenv .env
# Activate the .env/ virtual environment
$ source .env/bin/activate
From then on, only the activation step is needed:
# Activate the .env/ virtual environment
jiangdong@Mac Mini:~ $ source .env/bin/activate
(.env) jiangdong@Mac Mini:~ $
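The same isolation can also be had without installing virtualenv, because Python 3 ships a `venv` module in its standard library. The sketch below is only an illustration under that assumption: the helper name `create_env` and the throwaway directory are hypothetical, and `with_pip=False` is chosen just to keep the example fast (pass `with_pip=True` to get pip inside the environment).

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment at env_dir and return its pyvenv.cfg path."""
    # with_pip=False skips bootstrapping pip, which keeps this sketch quick.
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


# Hypothetical location: a temporary directory instead of ~/.env.
demo_dir = Path(tempfile.mkdtemp()) / ".env"
cfg_path = create_env(demo_dir)
print(cfg_path.exists())
```

An environment created this way is activated exactly like the virtualenv one, with `source <env_dir>/bin/activate`.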
Installing the Hugging Face Hub Python packages
jiangdong@Mac Mini:~ $ source .env/bin/activate
(.env) jiangdong@Mac Mini:~ $
# Install the main huggingface_hub package
jiangdong@Mac Mini:~ $ pip install -U "huggingface_hub"
Requirement already satisfied: huggingface_hub in ./.env/lib/python3.13/site-packages (0.28.1)
Requirement already satisfied: filelock in ./.env/lib/python3.13/site-packages (from huggingface_hub) (3.17.0)
Requirement already satisfied: fsspec>=2023.5.0 in ./.env/lib/python3.13/site-packages (from huggingface_hub) (2025.2.0)
Requirement already satisfied: packaging>=20.9 in ./.env/lib/python3.13/site-packages (from huggingface_hub) (24.2)
Requirement already satisfied: pyyaml>=5.1 in ./.env/lib/python3.13/site-packages (from huggingface_hub) (6.0.2)
Requirement already satisfied: requests in ./.env/lib/python3.13/site-packages (from huggingface_hub) (2.32.3)
Requirement already satisfied: tqdm>=4.42.1 in ./.env/lib/python3.13/site-packages (from huggingface_hub) (4.67.1)
Requirement already satisfied: typing-extensions>=3.7.4.3 in ./.env/lib/python3.13/site-packages (from huggingface_hub) (4.12.2)
Requirement already satisfied: charset-normalizer<4,>=2 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub) (2025.1.31)
# Install the huggingface_hub[cli] extra
(.env) jiangdong@Mac Mini:~ $ pip install -U "huggingface_hub[cli]"
Requirement already satisfied: huggingface_hub[cli] in ./.env/lib/python3.13/site-packages (0.28.1)
Requirement already satisfied: filelock in ./.env/lib/python3.13/site-packages (from huggingface_hub[cli]) (3.17.0)
Requirement already satisfied: fsspec>=2023.5.0 in ./.env/lib/python3.13/site-packages (from huggingface_hub[cli]) (2025.2.0)
Requirement already satisfied: packaging>=20.9 in ./.env/lib/python3.13/site-packages (from huggingface_hub[cli]) (24.2)
Requirement already satisfied: pyyaml>=5.1 in ./.env/lib/python3.13/site-packages (from huggingface_hub[cli]) (6.0.2)
Requirement already satisfied: requests in ./.env/lib/python3.13/site-packages (from huggingface_hub[cli]) (2.32.3)
Requirement already satisfied: tqdm>=4.42.1 in ./.env/lib/python3.13/site-packages (from huggingface_hub[cli]) (4.67.1)
Requirement already satisfied: typing-extensions>=3.7.4.3 in ./.env/lib/python3.13/site-packages (from huggingface_hub[cli]) (4.12.2)
Requirement already satisfied: InquirerPy==0.3.4 in ./.env/lib/python3.13/site-packages (from huggingface_hub[cli]) (0.3.4)
Requirement already satisfied: pfzy<0.4.0,>=0.3.1 in ./.env/lib/python3.13/site-packages (from InquirerPy==0.3.4->huggingface_hub[cli]) (0.3.4)
Requirement already satisfied: prompt-toolkit<4.0.0,>=3.0.1 in ./.env/lib/python3.13/site-packages (from InquirerPy==0.3.4->huggingface_hub[cli]) (3.0.50)
Requirement already satisfied: charset-normalizer<4,>=2 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub[cli]) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub[cli]) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub[cli]) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub[cli]) (2025.1.31)
Requirement already satisfied: wcwidth in ./.env/lib/python3.13/site-packages (from prompt-toolkit<4.0.0,>=3.0.1->InquirerPy==0.3.4->huggingface_hub[cli]) (0.2.13)
# Install the huggingface_hub[hf_transfer] extra
(.env) jiangdong@Mac Mini:~ $ pip install -U "huggingface_hub[hf_transfer]"
Requirement already satisfied: huggingface_hub[hf_transfer] in ./.env/lib/python3.13/site-packages (0.28.1)
Requirement already satisfied: filelock in ./.env/lib/python3.13/site-packages (from huggingface_hub[hf_transfer]) (3.17.0)
Requirement already satisfied: fsspec>=2023.5.0 in ./.env/lib/python3.13/site-packages (from huggingface_hub[hf_transfer]) (2025.2.0)
Requirement already satisfied: packaging>=20.9 in ./.env/lib/python3.13/site-packages (from huggingface_hub[hf_transfer]) (24.2)
Requirement already satisfied: pyyaml>=5.1 in ./.env/lib/python3.13/site-packages (from huggingface_hub[hf_transfer]) (6.0.2)
Requirement already satisfied: requests in ./.env/lib/python3.13/site-packages (from huggingface_hub[hf_transfer]) (2.32.3)
Requirement already satisfied: tqdm>=4.42.1 in ./.env/lib/python3.13/site-packages (from huggingface_hub[hf_transfer]) (4.67.1)
Requirement already satisfied: typing-extensions>=3.7.4.3 in ./.env/lib/python3.13/site-packages (from huggingface_hub[hf_transfer]) (4.12.2)
Requirement already satisfied: hf-transfer>=0.1.4 in ./.env/lib/python3.13/site-packages (from huggingface_hub[hf_transfer]) (0.1.9)
Requirement already satisfied: charset-normalizer<4,>=2 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub[hf_transfer]) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub[hf_transfer]) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub[hf_transfer]) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in ./.env/lib/python3.13/site-packages (from requests->huggingface_hub[hf_transfer]) (2025.1.31)
# Install transformers, used for loading models and similar tasks
(.env) jiangdong@Mac Mini:~ $ pip install -U "transformers"
Requirement already satisfied: transformers in ./.env/lib/python3.13/site-packages (4.48.3)
Requirement already satisfied: filelock in ./.env/lib/python3.13/site-packages (from transformers) (3.17.0)
Requirement already satisfied: huggingface-hub<1.0,>=0.24.0 in ./.env/lib/python3.13/site-packages (from transformers) (0.28.1)
Requirement already satisfied: numpy>=1.17 in ./.env/lib/python3.13/site-packages (from transformers) (2.2.2)
Requirement already satisfied: packaging>=20.0 in ./.env/lib/python3.13/site-packages (from transformers) (24.2)
Requirement already satisfied: pyyaml>=5.1 in ./.env/lib/python3.13/site-packages (from transformers) (6.0.2)
Requirement already satisfied: regex!=2019.12.17 in ./.env/lib/python3.13/site-packages (from transformers) (2024.11.6)
Requirement already satisfied: requests in ./.env/lib/python3.13/site-packages (from transformers) (2.32.3)
Requirement already satisfied: tokenizers<0.22,>=0.21 in ./.env/lib/python3.13/site-packages (from transformers) (0.21.0)
Requirement already satisfied: safetensors>=0.4.1 in ./.env/lib/python3.13/site-packages (from transformers) (0.5.2)
Requirement already satisfied: tqdm>=4.27 in ./.env/lib/python3.13/site-packages (from transformers) (4.67.1)
Requirement already satisfied: fsspec>=2023.5.0 in ./.env/lib/python3.13/site-packages (from huggingface-hub<1.0,>=0.24.0->transformers) (2025.2.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in ./.env/lib/python3.13/site-packages (from huggingface-hub<1.0,>=0.24.0->transformers) (4.12.2)
Requirement already satisfied: charset-normalizer<4,>=2 in ./.env/lib/python3.13/site-packages (from requests->transformers) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in ./.env/lib/python3.13/site-packages (from requests->transformers) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./.env/lib/python3.13/site-packages (from requests->transformers) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in ./.env/lib/python3.13/site-packages (from requests->transformers) (2025.1.31)
# Finally, list the installed packages to confirm everything is in place
(.env) jiangdong@Mac Mini:~ $ pip list
Package Version
------------------ ---------
certifi 2025.1.31
charset-normalizer 3.4.1
filelock 3.17.0
fsspec 2025.2.0
hf_transfer 0.1.9
huggingface-hub 0.28.1
idna 3.10
inquirerpy 0.3.4
Jinja2 3.1.5
MarkupSafe 3.0.2
mpmath 1.3.0
networkx 3.4.2
numpy 2.2.2
packaging 24.2
pfzy 0.3.4
pip 25.0.1
prompt_toolkit 3.0.50
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
safetensors 0.5.2
setuptools 75.8.0
sgl-kernel 0.0.1
sympy 1.13.1
tokenizers 0.21.0
torch 2.6.0
tqdm 4.67.1
transformers 4.48.3
typing_extensions 4.12.2
urllib3 2.3.0
wcwidth 0.2.13
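Scanning the full `pip list` output by eye is easy to get wrong, so a small script can check just the packages this walkthrough relies on. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical, and the package names are the ones from the pip commands above.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names taken from the pip commands above.
wanted = ["huggingface_hub", "hf_transfer", "transformers"]
for name, version in installed_versions(wanted).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it with the virtual environment activated; otherwise it reports on the system interpreter's packages instead.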
Pushing and Downloading Datasets and Models
Environment configuration
Configure the Hugging Face Hub environment variables
# In .zshrc or .bashrc
$ cat .zshrc
# Enable hf_transfer for faster uploads and downloads (requires the hf_transfer package installed above)
export HF_HUB_ENABLE_HF_TRANSFER=1
# Timeouts, in seconds, for ETag (metadata) requests and for file downloads
export HF_HUB_ETAG_TIMEOUT=86400
export HF_HUB_DOWNLOAD_TIMEOUT=86400
# Mirror endpoint for users in mainland China
export HF_ENDPOINT=https://hf-mirror.com
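# Hugging Face token: a read token is enough for pulling, a write token is needed for pushing; management-level tokens additionally allow organization changes, deletion, and so on (recent huggingface_hub releases also read HF_TOKEN, which is the preferred name)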
export HUGGING_FACE_HUB_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxx
(.env) jiangdong@Mac Mini:tmp $ source ~/.zshrc
(.env) jiangdong@Mac Mini:tmp $ env
...
HF_HUB_ENABLE_HF_TRANSFER=1
HF_HUB_ETAG_TIMEOUT=86400
HF_HUB_DOWNLOAD_TIMEOUT=86400
HF_ENDPOINT=https://hf-mirror.com
HUGGING_FACE_HUB_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxx
...
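Dumping `env` prints the token in clear text, which is worth avoiding in shared terminals or logs. As a hedged alternative, the sketch below reports the same variables from Python and masks anything whose name ends in `TOKEN`. The helper name `describe_env` and the masking rule are assumptions made for illustration; the variable names are copied from the shell configuration above.

```python
import os

# Variable names copied from the shell configuration above.
EXPECTED = [
    "HF_HUB_ENABLE_HF_TRANSFER",
    "HF_HUB_ETAG_TIMEOUT",
    "HF_HUB_DOWNLOAD_TIMEOUT",
    "HF_ENDPOINT",
    "HUGGING_FACE_HUB_TOKEN",
]


def describe_env(environ):
    """Return {name: value} with tokens masked and missing variables as None."""
    report = {}
    for name in EXPECTED:
        value = environ.get(name)
        if value is not None and name.endswith("TOKEN"):
            value = value[:4] + "***"  # never print a full token
        report[name] = value
    return report


for name, value in describe_env(os.environ).items():
    print(f"{name}={value if value is not None else '<unset>'}")
```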
Checking that the CLI works
# Install git and git-lfs
(.env) jiangdong@Mac Mini:~ $ brew install git
Warning: git 2.48.1 is already installed and up-to-date.
To reinstall 2.48.1, run:
brew reinstall git
(.env) jiangdong@Mac Mini:~ $ brew install git-lfs
Warning: git-lfs 3.6.1 is already installed and up-to-date.
To reinstall 3.6.1, run:
brew reinstall git-lfs
# Confirm the CLI is available
(.env) jiangdong@Mac Mini:~ $ huggingface-cli -h
usage: huggingface-cli <command> [<args>]
positional arguments:
{download,upload,repo-files,env,login,whoami,logout,auth,repo,lfs-enable-largefiles,lfs-multipart-upload,scan-cache,delete-cache,tag,version,upload-large-folder}
huggingface-cli command helpers
download Download files from the Hub
upload Upload a file or a folder to a repo on the Hub
repo-files Manage files in a repo on the Hub
env Print information about the environment.
login Log in using a token from huggingface.co/settings/tokens
whoami Find out which huggingface.co account you are logged in as.
logout Log out
auth Other authentication related commands
repo {create} Commands to interact with your huggingface.co repos.
lfs-enable-largefiles
Configure your repository to enable upload of files > 5GB.
scan-cache Scan cache directory.
delete-cache Delete revisions from the cache directory.
tag (create, list, delete) tags for a repo in the hub
version Print information about the huggingface-cli version.
upload-large-folder
Upload a large folder to a repo on the Hub
options:
-h, --help show this help message and exit
(.env) jiangdong@Mac Mini:~ $ huggingface-cli version
huggingface_hub version: 0.28.1
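The checks above can be folded into one quick preflight. This is a minimal sketch, assuming only that the three commands used in this walkthrough should be on `PATH`; the helper name `find_tools` is hypothetical.

```python
import shutil


def find_tools(names):
    """Map each command name to its path on PATH, or None when it is missing."""
    return {name: shutil.which(name) for name in names}


# Commands used in this walkthrough.
for name, path in find_tools(["git", "git-lfs", "huggingface-cli"]).items():
    print(f"{name}: {path or 'not found'}")
```

If `huggingface-cli` shows as not found, the usual cause is that the virtual environment has not been activated in the current shell.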
Downloading and pushing a dataset
Download:
jiangdong@Mac Mini:tmp $ huggingface-cli download --repo-type dataset simplescaling/s1K --local-dir s1k
Downloading '.gitattributes' to '/Users/jiangdong/.cache/huggingface/hub/datasets--simplescaling--s1K/blobs/1ef325f1b111266a6b26e0196871bd78baa8c2f3.incomplete'
.gitattributes: 2.46kB [00:00, 5.79MB/s]
Download complete. Moving file to /Users/jiangdong/.cache/huggingface/hub/datasets--simplescaling--s1K/blobs/1ef325f1b111266a6b26e0196871bd78baa8c2f3
Downloading 'README.md' to '/Users/jiangdong/.cache/huggingface/hub/datasets--simplescaling--s1K/blobs/099326cf6f2575e2302bba53675444bcfdd6eb07.incomplete'
README.md: 22.7kB [00:00, 26.9MB/s]
Download complete. Moving file to /Users/jiangdong/.cache/huggingface/hub/datasets--simplescaling--s1K/blobs/099326cf6f2575e2302bba53675444bcfdd6eb07
train-00000-of-00001.parquet: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6.88M/6.88M [01:53<00:00, 60.6kB/s]
Download complete. Moving file to /Users/jiangdong/.cache/huggingface/hub/datasets--simplescaling--s1K/blobs/899de0fb79be8465efb311aec94c4dcf9863c72684610b4626a8dacef2c2d2e7
/Users/jiangdong/.cache/huggingface/hub/datasets--simplescaling--s1K/snapshots/278d72baaa2b887a7e76a70a0ae254a5a45536e4
PS: git clone also works, but it uses roughly twice the disk space, because the .git directory holds its own copy of the versioned content.
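The download log shows the files landing in a cache folder named `datasets--simplescaling--s1K` under `~/.cache/huggingface/hub`. The naming appears to be the plural repo type followed by the repo id with `/` replaced by `--`. The sketch below reproduces that pattern so a cached repo can be located later; the helper name `cache_folder_name` is hypothetical, and the pattern is inferred from the log rather than taken as a guarantee.

```python
from pathlib import Path


def cache_folder_name(repo_id: str, repo_type: str = "model") -> str:
    """Build the cache folder name, e.g. 'datasets--simplescaling--s1K'."""
    # Pattern inferred from the log: plural repo type, then '/' replaced by '--'.
    return f"{repo_type}s--" + repo_id.replace("/", "--")


# Default cache root as it appears in the log (~/.cache/huggingface/hub).
cache_root = Path.home() / ".cache" / "huggingface" / "hub"
print(cache_root / cache_folder_name("simplescaling/s1K", "dataset"))
```

Inside that folder, the log shows a `blobs/` directory holding the downloaded content and a `snapshots/<commit hash>/` directory that the command prints at the end.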
Push:
jiangdong@Mac Mini:tmp $ cd s1k
jiangdong@Mac Mini:s1k $ huggingface-cli upload s1k-cupy . . --repo-type dataset
Start hashing 3 files.
Finished hashing 3 files.
train-00000-of-00001.parquet: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6.88M/6.88M [00:06<00:00, 1.02MB/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:07<00:00, 7.08s/it]
Removing 1 file(s) from commit that have not changed.
https://hf-mirror.com/datasets/DONGJINAG/s1k-cupy/tree/main/.
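The same push can be scripted through the `huggingface_hub` Python API instead of the CLI. The following is a sketch rather than a reference implementation: the wrapper name `upload_dataset_folder` is hypothetical, it assumes the token and endpoint are already configured through the environment variables above, and it needs network access when called.

```python
def upload_dataset_folder(folder_path: str, repo_id: str) -> str:
    """Create the dataset repo if needed, then upload a local folder to it."""
    # Imported lazily so the function can be defined without the package installed.
    from huggingface_hub import HfApi

    api = HfApi()  # reads the endpoint and token from the environment
    # exist_ok=True makes the call a no-op when the repo already exists.
    repo = api.create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)
    commit = api.upload_folder(
        folder_path=folder_path,
        repo_id=repo.repo_id,  # fully qualified "<namespace>/<name>"
        repo_type="dataset",
    )
    return str(commit)
```

Calling `upload_dataset_folder(".", "s1k-cupy")` from inside the `s1k` directory would correspond to the CLI command above.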
Downloading and pushing a model
Download:
jiangdong@Mac Mini:tmp $ huggingface-cli download deepseek-ai/Janus-1.3B --local-dir Janus-1.3B
Downloading '.gitattributes' to 'Janus-1.3B/.cache/huggingface/download/wPaCkH-WbT7GsmxMKKrNZTV4nSM=.a6344aac8c09253b3b630fb776ae94478aa0275b.incomplete'
.gitattributes: 1.52kB [00:00, 3.36MB/s]
Download complete. Moving file to Janus-1.3B/.gitattributes
Downloading 'README.md' to 'Janus-1.3B/.cache/huggingface/download/Xn7B-BWUGOee2Y6hCZtEhtFu4BE=.44e58a85f10a1aa0f43442501f8151cd16259516.incomplete'
README.md: 2.96kB [00:00, 4.76MB/s]
Download complete. Moving file to Janus-1.3B/README.md
Downloading 'arch.jpg' to 'Janus-1.3B/.cache/huggingface/download/sF5KJ0gbkGHoKLZwGTyqVDlXH78=.16b5a62960433000444996af47a63979016aa39f.incomplete'
arch.jpg: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 250k/250k [00:00<00:00, 250kB/s]
Download complete. Moving file to Janus-1.3B/arch.jpg
Downloading 'config.json' to 'Janus-1.3B/.cache/huggingface/download/8_PA_wEVGiVa2goH2H4KQOQpvVY=.ae9d81cc1bb235f4e91a4f87c98152e44306036f.incomplete'
config.json: 1.45kB [00:00, 4.85MB/s]
Download complete. Moving file to Janus-1.3B/config.json
...
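For use inside a script, the download has a Python counterpart in `huggingface_hub.snapshot_download`. The sketch below wraps it in a hypothetical helper named `download_model`; it assumes the environment variables configured earlier are in effect and requires network access when called.

```python
def download_model(repo_id: str, local_dir: str) -> str:
    """Download every file of a model repo into local_dir and return that path."""
    # Imported lazily so the function can be defined without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        repo_type="model",
        local_dir=local_dir,
    )
```

Calling `download_model("deepseek-ai/Janus-1.3B", "Janus-1.3B")` would mirror the CLI invocation above.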
Push:
jiangdong@Mac Mini:tmp $ cd Janus-1.3B
jiangdong@Mac Mini:Janus-1.3B $ huggingface-cli upload Janus-1.3B-copy . . --repo-type model
...
