Building
CSHMD is a clinical-AI initiative I'm building alongside my clinical work. The aim is decision-support and second-opinion tooling for dermatology, with particular focus on the conditions and skin tones that current public AI handles poorly. Most clinical AI is trained largely on lighter-skinned populations, and the resulting tools can fail visibly when applied to the patients I see daily.
The work is in active development. CSHMD is a working name; the initiative isn't yet incorporated.
The thesis
Most public dermatology AI is trained on academic datasets that under-represent Indian, South Asian, African, and other Fitzpatrick IV–VI skin tones. Models that work on lighter skin can fail visibly on darker skin — in classification accuracy and in the clinical recommendations downstream. For the patients I see daily, this gap is a real obstacle to using off-the-shelf AI in clinic.
Building from inside a clinical practice changes both what the system is good at and how its mistakes get caught. Most clinical-AI ventures are run either by engineers without clinical depth, or by doctors who've left practice to build full-time. Staying in practice means I see how tools fail on real cases, can adjust accordingly, and have ongoing context for what's worth building.
What exists today
The current pipeline is a layered measurement system for clinical dermatology imagery:
- DINOv3-based severity regressors for global condition grading.
- SAM2 segmentation paired with CIELAB colorimetry for diffuse patches like melasma and vitiligo, where area and colour change matter more than discrete lesions.
- RF-DETR detection with a DINOv3 backbone for counting discrete lesions like acne and post-inflammatory hyperpigmentation (PIH). The acne taxonomy is expanded to eight classes specifically to capture the acne-to-PIH transition, which bothers Indian patients more than the underlying acne does.
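To make the colorimetry idea in the second layer concrete, here is a minimal sketch, not the pipeline's actual code. It converts sRGB pixels to CIELAB and reports a patch's area fraction and its colour difference from the surrounding skin. The function names, the flat-list inputs, and the choice of the CIE76 colour difference are all assumptions made for illustration; in the pipeline described above the mask would come from SAM2 segmentation rather than a hand-written list.

```python
import math

# D65 reference white for the 2-degree observer (standard published values).
WHITE_D65 = (0.95047, 1.0, 1.08883)


def srgb_to_lab(rgb):
    """Convert one 8-bit sRGB pixel (r, g, b) to CIELAB (L*, a*, b*)."""
    # Undo the sRGB transfer curve to get linear light in [0, 1].
    lin = []
    for c in rgb:
        c = c / 255.0
        lin.append(c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = lin
    # Linear sRGB to CIE XYZ under D65.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        # Piecewise cube-root function from the CIELAB definition.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = (f(v / w) for v, w in zip((x, y, z), WHITE_D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def mean_lab(pixels):
    """Average a list of sRGB pixels in CIELAB space."""
    labs = [srgb_to_lab(p) for p in pixels]
    return tuple(sum(lab[i] for lab in labs) / len(labs) for i in range(3))


def patch_contrast(pixels, mask):
    """Return (area_fraction, delta_e) for a masked patch vs. unmasked skin.

    pixels: flat list of (r, g, b) tuples; mask: flat list of bools, same length.
    Both inputs are hypothetical stand-ins for an image and a segmentation mask.
    """
    inside = [p for p, m in zip(pixels, mask) if m]
    outside = [p for p, m in zip(pixels, mask) if not m]
    if not inside or not outside:
        raise ValueError("mask must select some, but not all, pixels")
    # CIE76 colour difference: Euclidean distance between mean Lab values.
    delta_e = math.dist(mean_lab(inside), mean_lab(outside))
    return len(inside) / len(pixels), delta_e


# Toy 4-pixel "image": three lighter pixels and one darker patch pixel.
toy_pixels = [(200, 150, 120)] * 3 + [(120, 80, 60)]
toy_mask = [False, False, False, True]
area_fraction, delta_e = patch_contrast(toy_pixels, toy_mask)
```

Tracking `area_fraction` and `delta_e` for the same patch across visits is one way area and colour change could be followed over time. A perceptually more uniform formula such as CIEDE2000 could replace CIE76 without changing the overall structure.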
In April 2026 I built AutoDerm at the OpenAI Codex Community Hackathon. It is a per-lesion acne detection system implementing Andrej Karpathy's autoresearch pattern, where Codex acts as an autonomous reasoning layer running training-evaluation loops without a human in the loop. Across roughly 57 autoresearch iterations, the primary metric moved from 0.061 to 0.273 mAP50–95, about a 4.5-fold improvement.
These are research-stage systems, not deployed products.
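For readers unfamiliar with the pattern, the training-evaluation loop described above can be sketched as a simple propose, evaluate, keep-if-better cycle. This is an illustrative sketch under stated assumptions, not AutoDerm's code: `autoresearch_loop`, `toy_propose`, and `toy_evaluate` are hypothetical names, the keep-if-better acceptance rule is an assumption, and the toy scoring function stands in for a real training run scored by mAP50–95. In AutoDerm, the role of the proposer is played by Codex.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def autoresearch_loop(
    initial: Config,
    propose: Callable[[Config], Config],
    evaluate: Callable[[Config], float],
    iterations: int,
) -> Tuple[Config, float, List[float]]:
    """Repeatedly propose a change, score it, and keep it only if it improves."""
    best_config = dict(initial)
    best_score = evaluate(best_config)
    history = [best_score]  # best score after each iteration, starting point first
    for _ in range(iterations):
        candidate = propose(best_config)  # in AutoDerm this role belongs to Codex
        score = evaluate(candidate)       # stand-in for train + validation metric
        if score > best_score:
            best_config, best_score = candidate, score
        history.append(best_score)
    return best_config, best_score, history


# Toy stand-ins so the sketch runs end to end (purely hypothetical).
rng = random.Random(0)


def toy_propose(cfg: Config) -> Config:
    # Randomly scale the learning rate up or down.
    return {"lr": cfg["lr"] * rng.uniform(0.5, 2.0)}


def toy_evaluate(cfg: Config) -> float:
    # Fake "metric" that peaks when the learning rate is 1e-3.
    return -(math.log10(cfg["lr"]) + 3.0) ** 2


# 57 iterations only mirrors the count mentioned above; it has no other meaning.
best_cfg, best_score, history = autoresearch_loop(
    {"lr": 0.1}, toy_propose, toy_evaluate, iterations=57
)
```

Because a candidate is accepted only when it beats the current best, `history` never decreases. A real loop would also need guards this sketch omits, such as a held-out test set that the loop never sees, so that repeated selection on one validation metric does not overstate progress.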
What's in progress
Three threads are active:
- Pipeline maturation. Moving the imaging models from research-stage toward clinical-evaluation-stage — broader condition coverage, validation against expert dermatologists, better robustness across skin types.
- Consultation tooling. Research on multi-language clinical conversation analysis. Early-stage.
- External pilots. Scoping a possible pilot with US aesthetic-dermatology chains for consultation tooling. The Indian context won't transfer directly: the models would need retraining on US-specific data before any pilot. Early-stage research.
What I'm looking for
Specifically:
- Conversations with clinicians building AI in adjacent specialties — pathology, ophthalmology, radiology. What worked, what didn't, what the regulatory path looked like.
- Operators of US aesthetic-dermatology chains considering AI pilots for consultation tooling.
- Researchers working on clinical AI for skin-of-colour populations. The gap in public datasets here is real; collaboration on benchmarks, validation, or shared evaluation frameworks would be useful.
- Investors focused on clinical AI or vertical AI in healthcare who are comfortable with the stage. This is pre-seed by any conventional measure, with technical traction but no commercial validation yet.
The contact form on this site is the simplest way to reach me directly.
Recent
Two builds in the same week of April 2026, both centred on the same problem — autonomous training-evaluation loops for per-lesion acne detection. The first was at the OpenAI Codex Community Hackathon in Bengaluru, where I built AutoDerm as the implementation vehicle for the autoresearch pattern. The second was at the GrowthX OpenCode Buildathon — India's first OpenCode buildathon — where I extended the same line of work, iterating on the detection pipeline and the autonomous evaluation loop.