split

by BytesAgain

Data splitting techniques and strategies reference — partitioning datasets, string splitting, file splitting, and ML train/test splits. Use when dividing data, chunking files, or designing data partitioning strategies.


Install

claude skill add --url github.com/openclaw/skills/tree/main/skills/bytesagain1/split

Documentation

Split — Data Splitting Reference

Quick-reference skill for data splitting techniques, partitioning strategies, and practical patterns.

When to Use

  • Splitting strings by delimiters, patterns, or fixed widths
  • Partitioning datasets for ML training/validation/test
  • Dividing large files into manageable chunks
  • Database sharding and horizontal partitioning
  • Understanding split strategies for distributed systems

Commands

intro

bash
scripts/script.sh intro

Overview of data splitting — concepts, common use cases, and terminology.

string

bash
scripts/script.sh string

String splitting techniques — delimiters, regex, fixed-width, tokenization.
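
The source does not show what the string command prints, so the following is only a minimal Python sketch of the four techniques named above, using the standard library. The helper split_fixed_width and the sample inputs are hypothetical and are not part of the skill.

```python
import re
import shlex


def split_fixed_width(text, width):
    """Cut text into consecutive pieces of at most `width` characters."""
    if width <= 0:
        raise ValueError("width must be positive")
    return [text[i:i + width] for i in range(0, len(text), width)]


# Delimiter split: cut on a literal comma, then trim stray whitespace.
by_delimiter = [field.strip() for field in "alice,30,  london".split(",")]

# Regex split: treat any run of commas, semicolons, or whitespace as one separator.
by_regex = re.split(r"[,;\s]+", "a, b;c   d")

# Fixed-width split: the last piece may be shorter than the width.
by_width = split_fixed_width("2024010112", 4)

# Tokenization: shlex follows shell-like quoting, so quoted spaces survive.
tokens = shlex.split('cp "my file.txt" backup/')

print(by_delimiter, by_regex, by_width, tokens)
```

A plain str.split is usually enough for clean delimited data; the regex and shlex variants are worth the extra cost only when separators vary or quoting matters.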

file

bash
scripts/script.sh file

File splitting methods — by size, lines, patterns, and round-robin.
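
As a rough illustration of two of the methods listed above (by lines and round-robin), here is a Python sketch using only the standard library. The function names, output file names, and the temporary demo file are assumptions made for this example; the skill's own script output is not shown in the source.

```python
import tempfile
from contextlib import ExitStack
from pathlib import Path


def split_by_lines(source, out_dir, lines_per_chunk):
    """Write consecutive chunks of at most `lines_per_chunk` lines each."""
    if lines_per_chunk <= 0:
        raise ValueError("lines_per_chunk must be positive")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths, chunk = [], None
    try:
        # newline="" keeps the original line endings untouched.
        with open(source, "r", encoding="utf-8", newline="") as src:
            for index, line in enumerate(src):
                if index % lines_per_chunk == 0:  # time to start a new chunk
                    if chunk is not None:
                        chunk.close()
                    path = out_dir / f"chunk_{len(paths):04d}.txt"
                    chunk = open(path, "w", encoding="utf-8", newline="")
                    paths.append(path)
                chunk.write(line)
    finally:
        if chunk is not None:
            chunk.close()
    return paths


def split_round_robin(source, out_dir, parts):
    """Deal lines out to `parts` files in turn: line 0 to part 0, line 1 to part 1, ..."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"part_{i:02d}.txt" for i in range(parts)]
    with ExitStack() as stack:
        src = stack.enter_context(open(source, "r", encoding="utf-8", newline=""))
        outs = [stack.enter_context(open(p, "w", encoding="utf-8", newline=""))
                for p in paths]
        for index, line in enumerate(src):
            outs[index % parts].write(line)
    return paths


# Small demo on a throwaway five-line file.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "input.txt"
    source.write_text("1\n2\n3\n4\n5\n", encoding="utf-8")
    chunks = [p.read_text(encoding="utf-8")
              for p in split_by_lines(source, Path(tmp) / "chunks", 2)]
    parts = [p.read_text(encoding="utf-8")
             for p in split_round_robin(source, Path(tmp) / "parts", 2)]

print(chunks)
print(parts)
```

Splitting by lines keeps neighbouring records together, while round-robin spreads them evenly across outputs, which tends to balance load better when the input is sorted.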

dataset

bash
scripts/script.sh dataset

ML dataset splitting — train/val/test, stratified, time-series, k-fold.
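
The four split types named above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the skill's implementation: all function names, default fractions, and seeds are hypothetical, and in practice a library such as scikit-learn would normally be used instead.

```python
import random
from collections import defaultdict


def train_val_test_split(items, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle once with a fixed seed, then slice into three disjoint parts."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    return items[n_test + n_val:], items[n_test:n_test + n_val], items[:n_test]


def stratified_split(items, labels, test_frac=0.2, seed=0):
    """Split each label group separately so class proportions are preserved."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for item, label in zip(items, labels):
        groups[label].append(item)
    train, test = [], []
    for label, group in groups.items():
        rng.shuffle(group)
        cut = round(len(group) * test_frac)
        test.extend((item, label) for item in group[:cut])
        train.extend((item, label) for item in group[cut:])
    return train, test


def time_series_split(items, test_frac=0.2):
    """Keep chronological order: the most recent slice becomes the test set."""
    cut = len(items) - round(len(items) * test_frac)
    return items[:cut], items[cut:]


def k_fold_indices(n, k):
    """Yield (train_indices, val_indices) for k contiguous folds over n samples."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    start = 0
    for fold in range(k):
        size = n // k + (1 if fold < n % k else 0)  # spread the remainder
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size


train, val, test = train_val_test_split(range(20))
print(len(train), len(val), len(test))
```

Note that time_series_split deliberately does not shuffle: shuffling time-ordered data would leak future information into the training set.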

database

bash
scripts/script.sh database

Database partitioning — horizontal, vertical, hash, range, and list.
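
To make the partitioning schemes above concrete, the sketch below shows how a row or key might be routed under each one. It models the routing logic only, in Python, and is not tied to any particular database; every function name and sample value here is an assumption introduced for illustration.

```python
import bisect
import zlib


def hash_partition(key, partitions):
    """Hash partitioning: a stable hash of the key, modulo the partition count."""
    # zlib.crc32 is used because Python's built-in hash() is salted per process.
    return zlib.crc32(str(key).encode("utf-8")) % partitions


def range_partition(value, upper_bounds):
    """Range partitioning: `upper_bounds` are sorted, exclusive upper limits.

    Values at or above the last bound fall into one extra overflow partition.
    """
    return bisect.bisect_right(upper_bounds, value)


def list_partition(value, mapping):
    """List partitioning: each partition names the exact values it holds."""
    for name, values in mapping.items():
        if value in values:
            return name
    raise KeyError(f"no partition lists {value!r}")


def vertical_partition(row, key, column_groups):
    """Vertical partitioning: split columns into groups that share the key."""
    return [{key: row[key], **{col: row[col] for col in cols}}
            for cols in column_groups]


print(range_partition(150, [100, 200]))
print(list_partition("FR", {"eu": {"FR", "DE"}, "na": {"US", "CA"}}))
print(vertical_partition({"id": 1, "name": "a", "bio": "long text"},
                         "id", [["name"], ["bio"]]))
```

Hash, range, and list are all forms of horizontal partitioning (rows are divided), whereas vertical partitioning divides columns and keeps the key in every piece so rows can be rejoined.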

strategies

bash
scripts/script.sh strategies

Splitting strategies for distributed systems — consistent hashing, sharding keys.
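
Consistent hashing, mentioned above, can be sketched as a hash ring with virtual nodes. The class below is a minimal illustration under stated assumptions: the HashRing name, the SHA-256-based position function, and the default of 100 virtual nodes per server are all choices made for this example, not details taken from the skill.

```python
import bisect
import hashlib


def ring_position(value):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual nodes per physical node
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            point = ring_position(f"{node}#{i}")
            if point in self._owner:  # position already taken; skip it
                continue
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get(self, key):
        """Return the node owning the first position clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        index = bisect.bisect_right(self._points, ring_position(key))
        return self._owner[self._points[index % len(self._points)]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {key: ring.get(key) for key in keys}

ring.add("node-d")
after = {key: ring.get(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The point of the design is that adding a node only pulls keys onto the new node; keys never shuffle between existing nodes, unlike a plain hash-modulo-N scheme where changing N remaps most keys.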

examples

bash
scripts/script.sh examples

Practical split examples across languages and tools.

pitfalls

bash
scripts/script.sh pitfalls

Common pitfalls and best practices when splitting data.

help

bash
scripts/script.sh help

version

bash
scripts/script.sh version

Configuration

Variable     Description
SPLIT_DIR    Data directory (default: ~/.split/)

Powered by BytesAgain | bytesagain.com | hello@bytesagain.com