Turn a Departing Engineer's Judgment Into an Editable, Versioned Skill File

In ~5 mins: what dot-skill actually generates, the capability/persona split, the S = (A, M, L) contract, the correct-and-roll-back loop, the behavioral-fidelity gap the paper admits, and a full install-to-rollback walkthrough at the end.

A tool that turns your coworker into an installable AI skill crossed roughly 18,500 GitHub stars.

dot-skill reads a person’s scattered work traces (docs, code reviews, and chat decisions) and writes them into a skill file an agent can load.

The backlash arrived fast. Someone shipped an anti-distillation skill that adds noise to your own traces so you cannot be cleanly copied.

Now there is a paper. COLLEAGUE.SKILL, posted by Shanghai AI Lab on May 29, drops the digital-twin pitch for a narrower claim: this is a file format for expertise, not a copy of a person.

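Recently, a GitHub project called “同事.skill” (colleague.skill) went viral. On April 3, a blogger said she had built an “anti-distillation skill” project. In her words, we are all just workhorses out here, and nobody wants to be turned into a skill and then lose their job, so she invented the “anti-distillation skill.” She hopes everyone can last a little longer in this AI wave.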

10:30 PM · Apr 3, 2026 · 949K Views

The useful part is the artifact. Whether it keeps the judgment is the part nobody has measured.

dot-skill is open source under MIT, written in Python, at titanwings/colleague-skill. It started as colleague.skill, built for one job: when a teammate quits, capture their review standards and incident heuristics before the context walks out with them.

The five-person team at Shanghai AI Lab posted the technical report to arXiv on May 29, 2026. MIT Technology Review had already covered the trend in April, and the “distill them before they leave” framing is what drew both the stars and the pushback.

It builds on the Agent Skills standard, where a skill is a folder around a SKILL.md file plus optional scripts and references, loaded on demand. The repo ships three presets: colleague (the main one), celebrity, and relationship. A public gallery lists 215 skills from 165 contributors, which measures distribution, not whether any of them work.

Expertise rarely lives in a manual. It is scattered across design docs, review comments, chat decisions, and incident notes.

dot-skill reads those traces and writes a few plain Markdown files. The loadable one is SKILL.md. Behind it sit work.md (what the person knows) and persona.md (how they act).

Because it follows the Agent Skills format, any compatible host loads it: Claude Code, OpenClaw, Codex, or Hermes. And because the output is Markdown, you can read the extracted rules, fix them in plain English, version the result, and roll it back.

The paper calls the method person-grounded trace-to-skill distillation. A lightweight profile, a source scope, and a set of documents map to a package the paper writes as S = (A, M, L): the generated files, the install metadata, and the lifecycle state (version, correction count, rollback history). The package is meant to be portable, inspectable, composable, correctable, and governable.
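One way to picture the S = (A, M, L) contract is as a small typed record. The sketch below is illustrative only: the class and field names are assumptions made for this example, not the schema dot-skill writes to manifest.json or meta.json.

```python
from dataclasses import dataclass, field


@dataclass
class Lifecycle:
    """L: lifecycle state (version, correction count, rollback history)."""
    version: int = 1
    correction_count: int = 0
    rollback_history: list[int] = field(default_factory=list)


@dataclass
class SkillPackage:
    """S = (A, M, L). All field names here are hypothetical."""
    artifacts: dict[str, str]   # A: file name -> generated text
    metadata: dict[str, str]    # M: install metadata
    lifecycle: Lifecycle = field(default_factory=Lifecycle)  # L

    def apply_correction(self, filename: str, new_text: str) -> None:
        """Replace one artifact and advance the lifecycle counters."""
        if filename not in self.artifacts:
            raise KeyError(f"unknown artifact: {filename}")
        self.artifacts[filename] = new_text
        self.lifecycle.version += 1
        self.lifecycle.correction_count += 1


pkg = SkillPackage(
    artifacts={"work.md": "## Review order\n- auth first\n"},
    metadata={"character": "colleague", "slug": "jane-doe"},
)
pkg.apply_correction("work.md", "## Review order\n- authentication first\n")
print(pkg.lifecycle.version, pkg.lifecycle.correction_count)
```

Keeping L separate from A is what makes "correctable" and "governable" checkable: the files can change while the lifecycle record says how many times and from which version.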

The sharpest design choice is keeping capability separate from behavior. work.md holds review criteria, workflows, and decision heuristics. persona.md holds tone, interaction rules, and a correction log.

Those generate three entry points: the full skill, work-only, and persona-only. Work-only is the safer one, because a review checklist does not need a personality. In the paper’s example, a colleague skill encodes a review order: check authentication, input validation, rate limiting, response schema, and sensitive-data exposure before smaller issues.
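The review order in that example can be sketched as a priority list applied to findings. This is a minimal illustration, assuming findings arrive as (category, note) pairs; the function name and sample findings are made up for the example and are not taken from the paper or the repo.

```python
# Categories from the paper's example, in the order the reviewer checks them.
REVIEW_ORDER = [
    "authentication",
    "input validation",
    "rate limiting",
    "response schema",
    "sensitive-data exposure",
]


def order_findings(findings: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Sort findings so listed categories come first, in review order.

    Unlisted categories (the "smaller issues") go last and keep their
    relative order, because sorted() is stable.
    """
    rank = {name: i for i, name in enumerate(REVIEW_ORDER)}
    return sorted(findings, key=lambda f: rank.get(f[0], len(REVIEW_ORDER)))


# Hypothetical findings for illustration.
findings = [
    ("naming", "rename tmp to retry_budget"),
    ("rate limiting", "no limit on /export"),
    ("authentication", "token not checked on PATCH"),
]
ordered = order_findings(findings)
for category, note in ordered:
    print(f"{category}: {note}")
```

A checklist like this needs no persona at all, which is the argument for the work-only entry point.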

Generation emits seven files on schema v3: SKILL.md, work.md, persona.md, the two sub-skills work_skill.md and persona_skill.md, plus manifest.json and meta.json for install and lifecycle metadata.
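A quick completeness check over those seven names might look like the sketch below. It assumes all seven files sit flat in one skill directory, which the paper summary does not confirm, so adjust the paths to whatever layout your generated skill actually has.

```python
import tempfile
from pathlib import Path

# The seven file names listed for schema v3.
EXPECTED_FILES = [
    "SKILL.md", "work.md", "persona.md",
    "work_skill.md", "persona_skill.md",
    "manifest.json", "meta.json",
]


def missing_files(skill_dir: Path) -> list[str]:
    """Return the expected file names that are absent from skill_dir."""
    return [name for name in EXPECTED_FILES if not (skill_dir / name).is_file()]


def demo_missing() -> list[str]:
    """Build a throwaway directory with only the five Markdown files."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp)
        for name in EXPECTED_FILES[:5]:
            (skill_dir / name).write_text("", encoding="utf-8")
        return missing_files(skill_dir)


print(demo_missing())
```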

Corrections are plain language. Say “he would not push back there” and the handler routes it. A work correction patches the matching ## section in work.md; a behavior correction appends a {scene, wrong, correct} record to persona.md. Every update archives the prior version first, and version_manager.py rolls back to any of the last 10.
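The routing rule can be sketched in a few lines. This is not the repo's handler: the function names, the "kind" field, and the choice to store each behavior record as a JSON bullet in persona.md are all assumptions made to show the two paths side by side.

```python
import json
import re


def patch_section(markdown: str, heading: str, new_body: str) -> str:
    """Replace the body under '## <heading>' up to the next '## ' or end of text."""
    pattern = re.compile(
        rf"(^## {re.escape(heading)}[ \t]*\n)(.*?)(?=^## |\Z)",
        flags=re.MULTILINE | re.DOTALL,
    )
    if not pattern.search(markdown):
        raise ValueError(f"no section named {heading!r}")
    return pattern.sub(
        lambda m: m.group(1) + new_body.rstrip("\n") + "\n\n", markdown, count=1
    )


def apply_correction(files: dict[str, str], correction: dict) -> dict[str, str]:
    """Route a correction: work -> patch a section, behavior -> append a record."""
    updated = dict(files)
    if correction["kind"] == "work":
        updated["work.md"] = patch_section(
            files["work.md"], correction["section"], correction["body"]
        )
    elif correction["kind"] == "behavior":
        record = {k: correction[k] for k in ("scene", "wrong", "correct")}
        updated["persona.md"] = (
            files["persona.md"].rstrip("\n") + "\n- " + json.dumps(record) + "\n"
        )
    else:
        raise ValueError(f"unknown correction kind: {correction['kind']!r}")
    return updated


files = {
    "work.md": "## Review order\n- check style first\n\n## Incidents\n- page the owner\n",
    "persona.md": "## Correction log\n",
}
after_work = apply_correction(
    files,
    {"kind": "work", "section": "Review order", "body": "- check authentication first"},
)
after_behavior = apply_correction(
    files,
    {"kind": "behavior", "scene": "design review",
     "wrong": "pushes back", "correct": "asks for data first"},
)
print(after_work["work.md"])
print(after_behavior["persona.md"])
```

Returning a new dict instead of mutating the input leaves the prior version available to archive, which mirrors the archive-then-update order described above.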

The paper makes one kind of claim: that this format and workflow exist and run. It does not claim the skill reproduces the person or improves anyone’s work. The authors name their own open problem the behavioral fidelity frontier, and the paper ships no held-out task study to close it.

Here is the short path. Every command, including install-to-deploy and rollback, is in the appendix at the end.

Clone the repo into your host’s skills directory, or hand the URL to your agent and let it install itself. Then run /dot-skill, pick the colleague family, and answer three questions: an alias, a one-line role, and a few personality tags.

Point it at authorized traces. It can auto-collect from Feishu, DingTalk, or Slack, or take uploads: PDFs, screenshots, .eml files, or pasted text. It then generates the files and, by default, installs the skill into Claude Code.

Read work.md before you trust it. Then invoke the full skill with /{character}-{slug}, or the safer work-only path with /{character}-{slug}-work.

dot-skill ships real software: collectors, a writer, installers, rollback, and 35 passing tests. The gap is between what the product page sells and what the paper will actually claim.

No fidelity evidence.

The paper proves a file format, not that a generated skill catches what the real engineer would. It says so itself and ships no held-out evaluation. You are trusting extraction quality you cannot yet measure.

The persona layer can turn a label into a rule.

The colleague persona analyzer translates freeform tags like “blame-shifter” or “PUA” into Layer 0 rules the agent must never break, and manual tags outrank the actual traces. The repo’s own colleague demo shows the skill dodging blame on cue. That is bias compiled into behavior, by design.

Governance is an affordance, not a guarantee.

Local-first, versioned files give you control, but nothing enforces consent, retention, or redaction in code, and deletion is rm -rf. The product site mentions RAG, yet there is no retrieval runtime in the repo: requirements.txt lists only requests, pypinyin, playwright, slack-sdk, python-docx, and openpyxl.

So the best recommendation is to adopt the work-only path and treat the rest as a research preview. Package a departing engineer’s review checklist as a work.md skill, test it against reviews you already graded, and keep persona off until someone measures fidelity. Separate relationship and celebrity papers are promised, which is exactly where the consent questions get harder.
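Testing against reviews you already graded can start as a plain precision and recall count. The sketch below assumes you can reduce each past review to a set of issue categories and do the same for what the skill flagged; the review ids and categories are invented for the example, and this is one possible scoring, not a method from the paper.

```python
def score_against_graded(
    graded: dict[str, set[str]], flagged: dict[str, set[str]]
) -> dict[str, float]:
    """Micro-averaged precision and recall over the graded reviews.

    graded:  review id -> issues the real engineer raised
    flagged: review id -> issues the work-only skill raised
    """
    tp = fp = fn = 0
    for review_id, expected in graded.items():
        got = flagged.get(review_id, set())
        tp += len(expected & got)   # issues both raised
        fp += len(got - expected)   # skill raised, engineer did not
        fn += len(expected - got)   # engineer raised, skill missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall}


# Hypothetical data for illustration.
graded = {"PR-1": {"authentication", "rate limiting"}, "PR-2": {"input validation"}}
flagged = {"PR-1": {"authentication"}, "PR-2": {"input validation", "naming"}}
scores = score_against_graded(graded, flagged)
print(scores)
```

Recall is the number to watch here: a missed authentication check costs more than an extra naming nit.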

If a teammate left tomorrow, would you trust a work-only skill of their review checklist, or is judgment the part that never compiles?


Appendix: the full walkthrough, shorter than the repo’s README. Eight steps, install to rollback.

1. Install dot-skill into your host.

Clone the repo into your host’s skills directory. For Claude Code that is ~/.claude/skills/dot-skill.

bash

git clone https://github.com/titanwings/colleague-skill ~/.claude/skills/dot-skill

OpenClaw uses ~/.openclaw/workspace/skills/dot-skill, and Codex uses ~/.codex/skills/dot-skill. For Hermes, clone anywhere and run python3 tools/install_hermes_skill.py --force. Or skip all of this and tell your agent to install the skill at the repo URL.
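If you script the clone step, the per-host paths above fit in a small lookup. The host keys and the helper name below are hypothetical; only the three paths come from this walkthrough. Hermes is left out on purpose because it installs through its own script rather than a fixed clone path.

```python
from pathlib import Path

# Clone targets named in step 1. The dictionary keys are made-up labels.
HOST_SKILL_DIRS = {
    "claude-code": "~/.claude/skills/dot-skill",
    "openclaw": "~/.openclaw/workspace/skills/dot-skill",
    "codex": "~/.codex/skills/dot-skill",
}


def clone_target(host: str) -> Path:
    """Return the expanded clone directory for a host with a fixed path."""
    try:
        return Path(HOST_SKILL_DIRS[host]).expanduser()
    except KeyError:
        raise ValueError(f"no fixed clone path for host {host!r}") from None


print(clone_target("codex"))
```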

2. Launch and pick a family.

Run /dot-skill and choose colleague. The other presets are relationship and celebrity.

3. Answer the intake.

Three questions: alias, a one-line role, and personality tags. Keep the tags factual. They become behavior rules, not flavor text.

4. Provide source material.

Auto-collect from a chat platform, or upload files.

bash

# Slack auto-collect (an admin installs the bot; free workspaces cap history at 90 days)
python3 tools/slack_auto_collector.py --setup
python3 tools/slack_auto_collector.py --name "Jane Doe"

Or upload PDFs, screenshots, .eml archives, or pasted text. You can also skip collection and generate from the intake alone.

5. Generate and inspect.

Generation runs through the writer and emits all seven files.

bash

python3 tools/skill_writer.py \
  --action create \
  --character colleague \
  --slug jane-doe \
  --name "Jane Doe" \
  --meta /tmp/meta.json \
  --work /tmp/work.md \
  --persona /tmp/persona.md \
  --base-dir ./skills/colleague \
  --no-install-claude-skill

By default the create step auto-installs into Claude Code. Pass --no-install-claude-skill to stop and read work.md and persona.md first. This is where you catch a manual tag that became a rule when it should not have.

6. Install the generated skill to a host.

bash

python3 tools/install_claude_generated_skill.py --skill-dir skills/colleague/jane-doe --force

Use install_openclaw_generated_skill.py or install_codex_generated_skill.py for the other hosts. Then invoke the full skill with /colleague-jane-doe, or the work-only entry point with /colleague-jane-doe-work.

7. Correct it in plain English.

Tell the agent what is wrong. A work fix patches a ## section; a behavior fix becomes a {scene, wrong, correct} record.

bash

python3 tools/skill_writer.py \
  --action update \
  --character colleague \
  --slug jane-doe \
  --correction-json /tmp/correction.json \
  --base-dir ./skills/colleague

8. Version and roll back.

Every update archives the prior version. The version manager keeps the last 10, and cleanup is manual, so old versions stay until you remove them.

bash

python3 tools/version_manager.py --action rollback --character colleague --slug jane-doe --version 3 --base-dir ./skills/colleague
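The archive-then-update behavior can be modeled in memory to see what a rollback does to version numbers. This is a sketch of one reading of the rules above, not version_manager.py itself: it assumes the archive grows until you clean it, that rollback can reach only the 10 most recent archived versions, and that a rollback is recorded as a new version. The class and constant names are hypothetical.

```python
ROLLBACK_WINDOW = 10  # "any of the last 10", per the walkthrough


class VersionedFile:
    """In-memory stand-in for one skill file with an archive."""

    def __init__(self, text: str) -> None:
        self.version = 1
        self.text = text
        # Grows without automatic cleanup, matching "cleanup is manual".
        self.archive: list[tuple[int, str]] = []

    def update(self, new_text: str) -> None:
        """Archive the current version first, then apply the change."""
        self.archive.append((self.version, self.text))
        self.version += 1
        self.text = new_text

    def rollback(self, version: int) -> None:
        """Restore an archived version if it is within the rollback window."""
        reachable = dict(self.archive[-ROLLBACK_WINDOW:])
        if version not in reachable:
            raise LookupError(f"version {version} is not within the last {ROLLBACK_WINDOW}")
        # Assumption: the rollback is itself a new version, so it can be undone.
        self.update(reachable[version])


doc = VersionedFile("v1 text")
doc.update("v2 text")
doc.update("v3 text")
doc.rollback(1)
print(doc.version, doc.text)
```

Treating the rollback as a forward update means nothing in the archive is overwritten, which keeps the history auditable.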

Deploy maps to wherever your agent reads skills: ~/.claude/skills/, ~/.openclaw/workspace/skills/, ~/.codex/skills/, or the Hermes skill directory.