Hackathon Judging Criteria

The scoring framework, rubric, and scorecard Opportunity Hack uses across every hackathon — free to copy and adapt for your own event

A hackathon's outcome rests on its judging. Bad criteria reward the showiest demos and ignore the hardest work; good criteria identify the projects that will actually ship. After running social-good hackathons since 2013 — with hundreds of judges across thousands of projects — Opportunity Hack has converged on a four-category rubric that has held up across both beginner and senior teams, in-person and remote, weekend and week-long events.

This page is the authoritative reference: the framework, the scoring scale, a worked example, the downloadable rubric, and the common judging pitfalls we've watched people stumble into. If you're judging an Opportunity Hack event, bookmark this page. If you're running your own hackathon, copy the rubric below. We open-source the model.

The Four-Category Framework

Each category is split into two sub-criteria, and each sub-criterion is scored on a 1-5 scale, so every category is worth 10 points. Categories are weighted equally and summed for a total out of 40. The categories were chosen because they map to the qualities that separate hackathon prototypes that ship from those that don't, independent of the technology stack, problem domain, or seniority of the team.
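The arithmetic behind the 40-point total can be sketched in a few lines. This is a minimal illustration, assuming each category is entered as a pair of sub-criterion scores as in the scorecard template further down; the names `category_total` and `overall_total` are hypothetical and not part of any Opportunity Hack tooling.

```python
CATEGORIES = ("scope", "documentation", "polish", "security")
MIN_SCORE, MAX_SCORE = 1, 5


def category_total(sub_a: int, sub_b: int) -> int:
    """Add two sub-criterion scores (each 1-5) into a category score out of 10."""
    for score in (sub_a, sub_b):
        if not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(f"sub-criterion score must be 1-5, got {score}")
    return sub_a + sub_b


def overall_total(scorecard: dict[str, tuple[int, int]]) -> int:
    """Add the four category totals into an overall score out of 40."""
    missing = set(CATEGORIES) - set(scorecard)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(category_total(*scorecard[name]) for name in CATEGORIES)


example = {
    "scope": (4, 4),
    "documentation": (3, 3),
    "polish": (4, 4),
    "security": (3, 3),
}
print(overall_total(example))  # 28
```

Because every category has the same 10-point ceiling, no weighting step is needed; the overall total is a plain sum.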

Scope (10 points)

Scope answers two questions: how many people benefit from this solution, and how hard was the underlying problem to solve? A volunteer-scheduling tool that serves one nonprofit's 50 volunteers is narrower in scope than a case-management platform serving 30 nonprofits across a region. A trivial CRUD app is less complex than a project that handles real data normalization or domain edge cases.

What to look for:

Number of nonprofits or end-users impacted
Complexity of the problem solved relative to existing solutions
Whether the team chose a real problem or a toy problem

Documentation (10 points)

Documentation is the proxy for sustainability. The most beautiful hackathon prototype is worthless to a nonprofit that can't deploy or maintain it. Look at the README, the inline comments, the deployment instructions, the user-facing UX. Could a non-author run this project six months from now?

What to look for:

README clearly explains setup, deployment, and environment variables
Code comments where logic isn't self-evident
User-facing UX is intuitive without a manual
Architecture decisions are recorded somewhere

Polish (10 points)

Polish is "how much work remains before this can be used today?" A demoable prototype with three TODO bugs and no error handling scores low. A working app deployed to a public URL with reasonable error states scores high. Polish is not about visual design — it's about production readiness.

What to look for:

Project is deployed somewhere public (not just running on the demo laptop)
Error states are handled gracefully
Common edge cases (empty state, slow network, invalid input) don't break the app
Estimated work remaining to reach MVP is small

Security (10 points)

Security is asymmetric: a vulnerability turns a useful app into a liability. Especially for nonprofits handling client data, donor records, or volunteer info, security failures can end a project. Score how the team thought about data protection and access control — even if the implementation isn't bulletproof, did they engage with the questions?

What to look for:

Sensitive data is gated by authentication
Role-based access exists where appropriate (admin vs. public users)
Secrets aren't committed to the repo
Input is validated; obvious injection vectors are closed
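To make the role-based access item concrete, here is a minimal sketch of the kind of server-side check a judge might look for in a team's repo. It is framework-free and purely illustrative: `require_role`, `PermissionDenied`, and `delete_shift` are hypothetical names, and the `user` dictionary stands in for whatever verified session or token data a real app would use.

```python
from functools import wraps


class PermissionDenied(Exception):
    """Raised when the caller's role is not allowed to perform an action."""


def require_role(*allowed_roles):
    """Decorator that rejects calls from users whose role is not allowed."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(user, *args, **kwargs):
            # Assumption: `user` was built on the server from a verified
            # session or token, not from a field the client can edit.
            if user.get("role") not in allowed_roles:
                raise PermissionDenied(
                    f"role {user.get('role')!r} may not call {handler.__name__}"
                )
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator


@require_role("admin")
def delete_shift(user, shift_id):
    """Hypothetical admin-only action."""
    return f"shift {shift_id} deleted by {user['name']}"
```

The point for judging is where the check runs. Hiding an admin button in the browser is a UX choice; the access decision has to be enforced in code the user cannot modify.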

The Scoring Scale

Score	Description
1	Poor — Significantly below expectations
2	Fair — Below expectations
3	Good — Meets expectations
4	Very Good — Exceeds expectations
5	Excellent — Significantly exceeds expectations

Most projects score 2-4 across categories. A score of 5 should be reserved for genuinely exceptional work — a project where you'd want to recommend the team to your own employer. A score of 1 should be reserved for projects that fundamentally don't address the category (no documentation at all, no security thinking at all). The middle is where most evaluation happens.

Worked Example: Scoring a Volunteer-Scheduling App

Project: a volunteer shift-scheduling app for a homeless-services nonprofit

Here's how an experienced judge would score a real Opportunity Hack project — a volunteer shift-scheduling tool built for a homeless-services nonprofit during a weekend hackathon. The team shipped a working app deployed to Vercel, with a basic README and authentication via Auth0.

Scope — 4/5

Serves one nonprofit's ~80 weekly volunteers across 4 program sites. Solves a real problem (the nonprofit was using paper sign-up sheets and Google Sheets). Scope is moderate — single nonprofit, but the problem is operationally critical and replaces a manual process consuming staff hours every week. Not the broadest possible impact, but real.

Documentation — 3/5

README covers setup and deploy clearly. No inline comments in the trickier scheduling-conflict logic. UX is intuitive for the volunteer-facing pages but the admin pages have unlabeled buttons. A future maintainer would understand the surface but might struggle with the conflict-detection code.

Polish — 4/5

Deployed publicly. Empty states render correctly. Form validation catches obvious errors. One known issue: timezone handling when a shift spans midnight, a moderate edge case. Overall closer to MVP than prototype, but not 5/5 production-ready.

Security — 3/5

Auth0 handles authentication well. Role-based access (admin vs. volunteer) is implemented. Secrets are in environment variables, not committed. However, there is no rate limiting on the sign-up endpoint, and the admin role check happens client-side only, so the API would let a determined volunteer escalate to admin privileges. Solid foundations with one important gap.

Total: 14/20 with one score per category, equivalent to 28/40 on the full 40-point rubric when each category's score is applied to both of its sub-criteria. A solid project worthy of a category prize but not the grand prize.

The Scorecard Template

Copy and paste this template for your own hackathon. We open-source the framework — no attribution required, but we'd love to hear if you adapt it. Email greg@ohack.org with feedback.

HACKATHON JUDGING SCORECARD — OPPORTUNITY HACK FRAMEWORK
=========================================================

Project: ________________________________________________
Team:    ________________________________________________
Judge:   ________________________________________________
Date:    ________________________________________________

CATEGORY 1: SCOPE (10 points)
Sub-criterion A — Impact on community:    ___ / 5
Sub-criterion B — Complexity of problem:  ___ / 5
                              SCOPE TOTAL: ___ / 10
Notes:

CATEGORY 2: DOCUMENTATION (10 points)
Sub-criterion A — Code & UX docs:         ___ / 5
Sub-criterion B — Ease of understanding:  ___ / 5
                       DOCUMENTATION TOTAL: ___ / 10
Notes:

CATEGORY 3: POLISH (10 points)
Sub-criterion A — Work remaining for MVP: ___ / 5
Sub-criterion B — Can use today:          ___ / 5
                              POLISH TOTAL: ___ / 10
Notes:

CATEGORY 4: SECURITY (10 points)
Sub-criterion A — Data protection:        ___ / 5
Sub-criterion B — Role-based access:      ___ / 5
                            SECURITY TOTAL: ___ / 10
Notes:

OVERALL TOTAL: ___ / 40
Recommendation: [ ] Grand Prize  [ ] Category Prize  [ ] Honorable Mention  [ ] No Award

Common Judging Pitfalls

After hundreds of hackathons we've seen the same judging mistakes repeat. Avoiding these doesn't make you a great judge — it just keeps you from being a bad one.

Scoring on demo flash, not on what shipped

The team with the slickest pitch isn't always the team that built the most. Polished slide decks and confident speakers can hide thin implementations. Spend at least as much time looking at the code repo as you do at the demo. If you can't see code, ask why.

Anchoring on the first project you score

Whatever score you give the first team you evaluate sets your internal scale for the rest. Score the first project, then deliberately wait until you've watched 2-3 more before finalizing, and revise the first one if it doesn't fit the spread you've calibrated.

Punishing teams for incomplete features they were honest about

A team that says "we didn't get to feature X, but here's our mitigation plan" is acting like a real engineering team. A team that hides incomplete work and hopes you don't notice is not. Reward honesty about gaps; penalize concealment.

"Beautiful" vs. "good"

Visual design is a tiebreaker, not a primary criterion. A clean utility app that solves the nonprofit's problem reliably scores higher than a beautiful app that doesn't. Polish is about production readiness, not aesthetics.

Forgetting that the nonprofit is the customer

Hackathon judges sometimes evaluate technical impressiveness without asking "would the nonprofit actually use this?" If a project requires the nonprofit to hire a dev to maintain it, or learn a new tool to operate it, that's a scope/polish problem, not a strength.

Frequently Asked Questions

Can I use this rubric for my own hackathon?

Yes. Copy and adapt freely. We open-source the Opportunity Hack judging framework and explicitly invite other hackathon organizers to use it. No attribution is required, though we'd love to hear how you adapt it. Email greg@ohack.org with feedback.

How long does scoring one project take with this rubric?

A reasonable judge spends 12-20 minutes per project: 4 minutes watching the pitch video, 5-8 minutes reading code and the README, and 3-8 minutes for final scoring and notes. Live demo Q&A adds another 5-10 minutes for finalist projects. A judge handling 8-10 projects can expect a 3-4 hour total commitment.
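For organizers planning a judging schedule, the estimate above can be turned into a rough calculator. This is a sketch using only the per-step minutes quoted in the answer; the function names are hypothetical. The per-project figures alone come to about 1.6 to 3.3 hours for 8-10 projects, so the 3-4 hour commitment presumably also covers finalist Q&A and time between projects.

```python
# Per-step estimates in minutes, as (low, high), taken from the answer above.
STEP_MINUTES = {
    "pitch_video": (4, 4),
    "code_and_readme": (5, 8),
    "scoring_and_notes": (3, 8),
}
FINALIST_QA_MINUTES = (5, 10)  # extra live Q&A for finalist projects


def minutes_per_project(finalist: bool = False) -> tuple[int, int]:
    """Return (low, high) minutes for judging a single project."""
    low = sum(lo for lo, _ in STEP_MINUTES.values())
    high = sum(hi for _, hi in STEP_MINUTES.values())
    if finalist:
        low += FINALIST_QA_MINUTES[0]
        high += FINALIST_QA_MINUTES[1]
    return low, high


def judging_hours(projects: int, finalists: int = 0) -> tuple[float, float]:
    """Return (low, high) total hours, rounded to one decimal place."""
    if not 0 <= finalists <= projects:
        raise ValueError("finalists must be between 0 and the project count")
    base_low, base_high = minutes_per_project()
    low = projects * base_low + finalists * FINALIST_QA_MINUTES[0]
    high = projects * base_high + finalists * FINALIST_QA_MINUTES[1]
    return round(low / 60, 1), round(high / 60, 1)


print(minutes_per_project())  # (12, 20)
print(judging_hours(10))      # (2.0, 3.3)
```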

What if a project doesn't fit cleanly into one of the categories?

Score what's present. A pure-data-pipeline project will score lower on Polish (no UI to evaluate) than a frontend-heavy project, but higher on Documentation and Security in many cases. Don't penalize projects for the categories they don't naturally exercise; let the average reflect the project's actual shape.

How are ties broken?

Multiple judges score each project, and the average is used. If two projects tie within 1 point, the head judge looks at sub-criterion scores and tiebreaker categories. Typically Scope (impact) and Polish (production readiness) win out over Documentation in close calls. Ultimately the head judge has discretion.
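The averaging and tiebreak steps can be sketched as follows. This is one possible reading, not the official procedure: it assumes each judge's card holds four category scores out of 10, treats totals within 1 point as a tie, and compares the combined Scope and Polish averages as the tiebreaker. All names are hypothetical, and the head judge's discretion always overrides the suggestion.

```python
from statistics import mean

TIE_MARGIN = 1.0                           # "within 1 point", per the text
TIEBREAK_CATEGORIES = ("scope", "polish")  # typical priority, not a hard rule


def average_card(cards: list[dict[str, int]]) -> dict[str, float]:
    """Average each category across all judges who scored the project."""
    return {cat: mean(card[cat] for card in cards) for cat in cards[0]}


def suggest_winner(name_a: str, cards_a: list[dict[str, int]],
                   name_b: str, cards_b: list[dict[str, int]]) -> str:
    """Suggest which of two projects ranks higher, or defer to the head judge."""
    avg_a, avg_b = average_card(cards_a), average_card(cards_b)
    total_a, total_b = sum(avg_a.values()), sum(avg_b.values())
    if abs(total_a - total_b) > TIE_MARGIN:
        return name_a if total_a > total_b else name_b
    # Close call: compare the tiebreaker categories only.
    tie_a = sum(avg_a[cat] for cat in TIEBREAK_CATEGORIES)
    tie_b = sum(avg_b[cat] for cat in TIEBREAK_CATEGORIES)
    if tie_a == tie_b:
        return "head judge decides"
    return name_a if tie_a > tie_b else name_b


project_a = [
    {"scope": 8, "documentation": 6, "polish": 8, "security": 6},
    {"scope": 8, "documentation": 6, "polish": 7, "security": 6},
]
project_b = [
    {"scope": 6, "documentation": 9, "polish": 6, "security": 7},
    {"scope": 6, "documentation": 9, "polish": 6, "security": 7},
]
print(suggest_winner("A", project_a, "B", project_b))  # A
```

In this example project B has the higher average total (28 vs. 27.5), but the gap is within the tie margin, so the stronger Scope and Polish scores put project A ahead.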

Should social impact be part of the criteria?

At Opportunity Hack, social impact is captured inside Scope's sub-criterion A ("impact on community"). We deliberately don't have a separate "social impact" category: every project at OHack already starts from a real nonprofit problem statement, so impact is the floor, not a differentiator. For non-social-good hackathons, you'd swap in a different sub-criterion under Scope.

Is this rubric biased against beginner teams?

Less than you'd think. Scope is independent of seniority: a beginner team that picks a focused problem and ships something narrow can match or beat a senior team that overscoped. Polish and Documentation reward effort and discipline, not raw skill. Security is the one category where senior teams often outscore beginners, but that's been more than offset historically by Scope and Documentation in our data.

Does the rubric work for non-software hackathons?

The four-category framework generalizes well: Scope, Documentation, Polish, and Security mostly map to "did they pick a real problem", "is the work understandable", "is the work usable today", and "is the work safe". For hardware, biotech, or design-focused hackathons, Security can become "safety/ethics" and the sub-criteria adjust accordingly.

What if I disagree with another judge's score?

Disagreement is normal and useful. Multi-judge scoring is designed to surface different perspectives. If two judges differ by 2+ points on the same project, the head judge brings them together briefly to discuss before final scores are locked. The goal isn't consensus; it's making sure both judges saw the same project. Keep your score if you still believe it after the discussion.
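A head judge could flag which scores need that conversation with a small helper like the one below. This is a sketch under one assumption the text leaves open: the 2-point gap is checked per scored item (for example, per sub-criterion) rather than on the overall total. The function name and the item labels are hypothetical.

```python
DISCUSSION_THRESHOLD = 2  # "differ by 2+ points", per the text


def items_to_discuss(judge_a: dict[str, int], judge_b: dict[str, int]) -> list[str]:
    """Return the scored items where two judges differ by the threshold or more."""
    return [
        item
        for item in judge_a
        if item in judge_b
        and abs(judge_a[item] - judge_b[item]) >= DISCUSSION_THRESHOLD
    ]


first = {"impact": 4, "complexity": 3, "data_protection": 2, "role_access": 4}
second = {"impact": 4, "complexity": 5, "data_protection": 3, "role_access": 1}
print(items_to_discuss(first, second))  # ['complexity', 'role_access']
```

The helper only lists where to look; it does not average or overwrite anything, which matches the advice to keep your score if you still believe it after the discussion.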

Ready to Judge an Opportunity Hack Event?

Apply to judge an upcoming Opportunity Hack hackathon, or browse our other hackathon resources.




