Research Loop

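This page is a case study for treating research not as “investigating and writing a report” but as an iterative process loop that includes question formulation, hypothesis branching, evidence acquisition, refutation, refocusing, decision support, and making provisional knowledge durable.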

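Research here is not limited to academic research. It covers any exploratory work in which the shape of the question itself can shift along the way, such as the following practical questions.

Should we adopt a new technology?
What is the root cause of a given problem?
Which design option is sound?
Which risks should we accept, and which paths should we drop?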

The aim of this page is to present the research loop as a case where, within PCE 2.0, Goal Ownership, Execution, Evaluation, Branch and Join, Checkpoint and Recovery, Corrupt Success, and Memory Writing intersect especially strongly.

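Purpose of this case

This case aims to make the following points clear.

Research is not a preliminary stage of implementation but an independent process frame.
The output of research is not only “the right answer” but also hypotheses, refutations, limiting conditions, proposals for the next experiment, and reasons for rejection.
A research loop does not proceed linearly; it can repeat question freeze -> branch -> join -> reframe -> next loop.
When joining evidence from each branch, simply picking the option that looks best is not enough.
A research conclusion that is plausible but thinly evidenced can become Corrupt Success.
Knowledge produced by research cannot always be made canonical as-is; provisional and canonical knowledge must be kept separate.
A research frame can succeed not only when it concludes “adopt” but also when it concludes “do not adopt” or “hold for now”.

The research loop is a very good case for understanding PCE 2.0, because here the following matter more than the artifact outcome.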

question shaping
evidence sufficiency
counterevidence retention
hypothesis invalidation
decision admissibility
memory discipline

Scenario

Suppose that a sequence of earlier Feature Delivery, PR Review, and Bug Triage cases has surfaced several problems around checkout pricing.

In the coupon-combination feature delivery, review was complicated.
In PR review, stale approval and scope drift became problems.
In bug triage, triaging a negative total bore directly on financial correctness.

As a result, the team arrives at the following research question.

research question

For changes to checkout pricing, rather than continuing to add hand-written regression cases, should we introduce a mechanism that defines pricing invariants declaratively and runs invariant-based verification on every change?
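
To make “declarative pricing invariants” concrete, the following is a minimal Python sketch. The order fields (`subtotal`, `discount`, `total`) and the invariant names are hypothetical and are not taken from the actual checkout system. The only point is that each invariant is a named predicate that a runner could evaluate on every change.

```python
from decimal import Decimal
from typing import Callable

Order = dict[str, Decimal]

# Hypothetical invariants: each entry is a named predicate over a priced order.
INVARIANTS: dict[str, Callable[[Order], bool]] = {
    "total_is_non_negative": lambda o: o["total"] >= 0,
    "discount_within_subtotal": lambda o: o["discount"] <= o["subtotal"],
    "total_matches_parts": lambda o: o["total"] == o["subtotal"] - o["discount"],
}


def violated_invariants(order: Order) -> list[str]:
    """Return the names of all invariants that the given order breaks."""
    return [name for name, holds in INVARIANTS.items() if not holds(order)]


# Illustrative orders only; the second one reproduces a negative total.
good = {"subtotal": Decimal("20.00"), "discount": Decimal("5.00"), "total": Decimal("15.00")}
bad = {"subtotal": Decimal("20.00"), "discount": Decimal("25.00"), "total": Decimal("-5.00")}
```

Under these assumptions, `violated_invariants(bad)` names the broken rules instead of relying on a hand-written regression case for that specific coupon combination.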

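However, the following are still undetermined.

How much of the current failure class can it actually catch?
Will false positives be so frequent that they increase review burden?
How does it connect to the existing PR review / bug triage / release process?
How does it change rollout / rollback / memory discipline?
Even if it is technically feasible, is it worth the operational cost?

This is a good research question because it has the following characteristics.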

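It is not an implementation task.
But it is not purely theoretical either.
It requires multiple hypotheses and branches.
It may not close in a single round of investigation.
“Do not adopt” is also a valid outcome.
It can produce both process memory and decision memory.

Overview

Written briefly, the flow of this case is as follows.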

research question intake
  -> question freeze
  -> hypothesis set
  -> branch:
       internal incident / review evidence
       external technique scan
       bounded prototype
       adoption cost / governance analysis
  -> join
  -> synthesis
  -> if uncertainty remains:
       narrow question
       second loop
  -> recommendation:
       adopt
       run pilot
       defer
       reject
  -> optional memory promotion

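The important point here is that a research loop is not of the find-the-answer type but of the reduce-the-right-uncertainty-enough-to-route-the-next-decision type.

In other words, the success condition of a research frame is not to obtain the complete truth about the world, but to make the next decision possible in a form that is not unsafe.

Why the research loop matters in PCE 2.0

At first glance research looks like “soft work”. In PCE 2.0 terms, however, it is an area that needs strong process discipline.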

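1. Because the question itself is a moving target

In implementation or PR review, the subject is relatively fixed. In research, by contrast, the following tend to happen as the investigation proceeds.

The original question was too coarse.
A different question should actually have been addressed.
The scope is too broad.
The success criterion was vague.

This is why Goal Ownership and reframe are especially important.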

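2. Because a plausible story is easy to construct

In research, artifact correctness is hard to see. That is exactly why the following easily get mixed in.

a plausible-sounding narrative
cherry-picked evidence
one-off prototype success
a strong recommendation without authority

This is a typical domain for Corrupt Success.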

3. Because evidence is highly asymmetric

Even for the same question, the evidence modalities differ widely:

internal incidents
prior PR review history
external papers / blogs / docs
prototype run results
cost model
human domain judgment

Crushing these carelessly into a single “research result” is dangerous.

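4. Because failed hypotheses are also important output

In research, the fact that a hypothesis was wrong is itself valuable. Productive failure is therefore especially important. PCE 2.0 can treat it as a typed delta or as failure memory.

5. Because it tends to be long-running

Research tends to span sessions. Checkpoint, recovery, and stale invalidation are therefore important.

research frame

The parent frame for this case can be set up, for example, as follows.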

process_frame:
  frame_id: research.checkout.invariant-verification.adoption
  parent_frame_id: release.checkout.spring-2026
  frame_kind: research

  goal: >
    Evaluate whether declarative invariant-based verification should be introduced for checkout pricing changes,
    and decide which of adopt / pilot / defer / reject is appropriate.

  success_criteria:
    - research question is defined in a bounded way
    - competing hypotheses are explicit
    - internal / external / prototype / adoption-risk evidence has been evaluated
    - required evidence and counterevidence are attached to the recommendation
    - recommendation is packaged in a form sufficient for a human-backed decision
    - provisional and canonical are not conflated

  scope:
    in_scope:
      - checkout pricing invariant verification adoption
      - historical failures and PR review evidence
      - bounded prototype and measurement
      - operational and governance implications
      - recommendation for next step
    out_of_scope:
      - full implementation rollout
      - release-wide migration execution
      - unrelated pricing redesign
    assumptions:
      - current pricing architecture remains baseline unless explicitly reframed
    non_goals:
      - proving global optimality
      - replacing all existing tests immediately

  actors:
    - research_lead
    - analyst_agent
    - internal_evidence_scanner
    - external_research_agent
    - prototype_executor
    - skeptic_reviewer
    - governance_reviewer
    - memory_writer
    - checkpoint_manager

  approval_points:
    - ap.research.checkout.invariant-verification.question-freeze
    - ap.research.checkout.invariant-verification.recommendation-acceptance

  eval_contract:
    - eval.research.evidence-quality.v1
    - eval.research.prototype-signal.v1
    - eval.research.recommendation-admissibility.v1
    - eval.research.memory-candidate.v1

  memory_write_policy:
    allowed_targets:
      - decisions
      - failure_memory
      - operational_memory
      - pending_candidates
    prohibited_targets:
      - raw_research_notes_as_canonical
      - unverified_hypotheses_as_decision_memory

  recovery_strategy:
    checkpoint_policy:
      - after_question_freeze
      - after_first_join
      - before_recommendation_approval
    rollback_anchor_policy:
      - last_accepted_question_state
    resume_conditions:
      - invalidate_stale_branch_returns
      - revalidate_current_architecture_assumptions
    escalation_path:
      - research_lead
      - product_owner

What matters in this frame is the following.

The goal is not “implement” but “decide whether to adopt”.
The success criteria explicitly include counterevidence.
Recommendation acceptance is an explicit gate.
The memory write policy separates provisional from canonical.
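
The memory write policy in the frame can be read as a small admission rule. The sketch below is only an illustration: the target names come from `memory_write_policy`, while the `can_write` function and its `human_accepted` flag are hypothetical. The flag encodes the assumption that a write to `decisions` needs human-backed acceptance.

```python
# Target names copied from memory_write_policy in the frame above.
ALLOWED_TARGETS = {"decisions", "failure_memory", "operational_memory", "pending_candidates"}
PROHIBITED_TARGETS = {
    "raw_research_notes_as_canonical",
    "unverified_hypotheses_as_decision_memory",
}


def can_write(target: str, human_accepted: bool) -> bool:
    """Return True if a memory write to `target` is admissible in this sketch."""
    if target in PROHIBITED_TARGETS or target not in ALLOWED_TARGETS:
        return False
    # Assumption: decision memory is canonical, so it needs human-backed acceptance.
    if target == "decisions":
        return human_accepted
    return True
```

With this rule, unapproved synthesis can still land in `pending_candidates`, which keeps provisional knowledge durable without promoting it.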

Actors and the asymmetry of responsibility

Research looks collaborative, but responsibility is asymmetric.

research_lead

global goal ownership
question freeze authority
final recommendation package owner
escalation sink for scope ambiguity

analyst_agent

hypothesis generation
synthesis draft
branch design support
does not hold final acceptance

internal_evidence_scanner

exploring incident / PR / bug triage history
extracting recurrence patterns
current project evidence gathering

external_research_agent

external technique / paper / blog / library scan
comparable systems survey
does not hold final responsibility for local fit judgment

prototype_executor

running the bounded prototype
benchmark / counterexample generation
does not hold the adoption decision

skeptic_reviewer

counterevidence surfacing
detecting narrative inflation
the role that breaks what is “merely plausible”

governance_reviewer

operational burden
impact on approval / review topology
checking memory / rollback / release implications

memory_writer

research-derived memory candidates の durable write authority
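does not hold recommendation acceptance itself

What matters here is that, even within research, the following are not symmetric.

execution
evaluation
approval
memory mutation

The skeptic_reviewer is especially important in research, because research that collects only supporting evidence easily becomes Corrupt Success.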

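Step 0: question intake

Research begins by receiving a “question”. But that question is usually coarse.

In this case too, it starts out as something like

“Wouldn't it be better to introduce invariant-based verification?”

This is weak as a research frame, because the following are ambiguous.

whether to adopt
whether to run a pilot
whether it can reduce the current pain
whether the false positive cost is acceptable
which failure class is being targeted

So what is needed first is question freeze.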

Step 1: question freeze

The research lead bounds the question as follows.

Evaluate whether invariant-based verification for checkout pricing changes can, compared with the current PR review + regression suite, detect defects of the negative-total / precedence / rounding class earlier while keeping review burden within an acceptable range.

This bounds at least the following.

target defect classes
comparison baseline
success notion
out-of-scope (no full pricing redesign)

Why question freeze matters

Common failures in research are the following.

The question silently widens along the way.
The prototype is interesting, so work starts on a different question.
“Is it technically feasible?” and “Should we adopt it?” get mixed together.

Question freeze is the first governance gate that prevents these.

Question freeze as an approval point

In this case there is ap.research...question-freeze, and branches are cut only after research_lead accepts the bounded question. This makes the research loop itself less prone to drift.

Step 2: Make the hypothesis set explicit

Once the question is frozen, the next step is to build the hypothesis set.

Example hypotheses

H1: invariant-based verification catches the target defect class significantly earlier
H2: improving the current regression + review checklist is sufficient
H3: full adoption is too heavy, but a pilot limited to financial correctness invariants is effective
H4: even if the prototype looks attractive, false positives and maintenance cost are high, and it is not worth adopting

What matters here is not to turn the research into “collecting reasons to introduce invariant verification”. Opposing and negating hypotheses must be listed explicitly.

The hypothesis set as a delta

As a Process Delta, this includes, for example, the following.

decision-support delta (research question freeze accepted)
hypothesis delta
evaluation target delta
counterevidence requirement note

In other words, already at an early stage of the research loop, the output is not an artifact but a typed delta.
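
One way to keep the hypothesis set from collapsing into reasons for adoption is to type it and check for an opposing hypothesis before branching. The following is a minimal sketch. The `Stance` labels and the assignment of a stance to each of H1 to H4 are assumptions made for illustration and are not part of the source.

```python
from dataclasses import dataclass
from enum import Enum


class Stance(Enum):
    FOR_ADOPTION = "for"
    PARTIAL = "partial"
    AGAINST_ADOPTION = "against"


@dataclass(frozen=True)
class Hypothesis:
    hid: str
    statement: str
    stance: Stance  # assumed label; the source does not tag hypotheses this way


HYPOTHESES = [
    Hypothesis("H1", "invariant-based verification catches the target defect class earlier", Stance.FOR_ADOPTION),
    Hypothesis("H2", "improving the current regression + review checklist is sufficient", Stance.AGAINST_ADOPTION),
    Hypothesis("H3", "a pilot limited to financial correctness invariants is effective", Stance.PARTIAL),
    Hypothesis("H4", "false positives and maintenance cost make adoption not worthwhile", Stance.AGAINST_ADOPTION),
]


def has_opposing_hypothesis(hypotheses: list[Hypothesis]) -> bool:
    """True if at least one hypothesis argues against adoption."""
    return any(h.stance is Stance.AGAINST_ADOPTION for h in hypotheses)
```

A set containing only H1 would fail this check, which is the situation the text warns against.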

Step 3: Cut the branches

In this case, research_lead cuts four branches.

branch A: internal evidence scan

Purpose:

review internal incidents, PR reviews, bug triage, and rollback notes
understand the recurrence of the target defect classes and the current pain

branch B: external scan

Purpose:

survey the general theory, tools, precedents, and failure cases of invariant-based verification

branch C: bounded prototype

Purpose:

apply a very small prototype to the current checkout rule set and obtain detection quality / false positive / engineering cost signals

branch D: adoption / governance analysis

Purpose:

review burden
CI / PR review integration
rollbackability
memory / maintenance / ownership cost
release process impact

Written in YAML style, this looks, for example, as follows.

branch_set:
  set_id: bs.research.checkout.invariant-verification.v1
  parent_frame_ref: research.checkout.invariant-verification.adoption
  purpose: >
    Run internal evidence, external scan, bounded prototype, and governance/adoption cost in parallel,
    and collect the evidence needed for recommendation synthesis
  branch_kind: validation

  retained_authorities:
    - global_goal_ownership_by_research_lead
    - question_reframe_authority_by_research_lead
    - recommendation_acceptance_by_research_lead_and_product_owner
    - canonical_memory_write_by_memory_writer

  shared_constraints:
    - do_not_treat_prototype_as_production_proof
    - preserve_counterevidence
    - no_canonical_decision_memory_write_from_unapproved_synthesis

  branch_specs:
    - child_frame_ref: research.checkout.invariant-verification.internal-evidence
      local_goal: >
        Extract the recurrence of the target defect class from historical incidents / reviews / bug triage
      return_contract:
        required_deltas:
          - internal_failure_patterns
          - current_pain_summary
          - contradictory_cases_if_any

    - child_frame_ref: research.checkout.invariant-verification.external-scan
      local_goal: >
        Organize the techniques, precedents, and limitations of invariant-based verification
      return_contract:
        required_deltas:
          - external_technique_summary
          - adoption_preconditions
          - failure_modes

    - child_frame_ref: research.checkout.invariant-verification.prototype
      local_goal: >
        Apply the bounded prototype to the current checkout rules and take a signal
      return_contract:
        required_deltas:
          - prototype_result
          - false_positive_note
          - engineering_effort_note

    - child_frame_ref: research.checkout.invariant-verification.governance
      local_goal: >
        Organize the operational impact on PR review, release, rollback, memory, and ownership
      return_contract:
        required_deltas:
          - governance_impact_note
          - rollout_integration_note
          - ownership_cost_note

  join_contract:
    join_kind: all_of
    required_branches:
      - research.checkout.invariant-verification.internal-evidence
      - research.checkout.invariant-verification.external-scan
      - research.checkout.invariant-verification.prototype
      - research.checkout.invariant-verification.governance
    readiness_rule:
      - all_returns_present_or_explicit_insufficient
    conflict_resolution:
      on_prototype_success_but_governance_failure: keep_conflict_visible
      on_external_precedent_vs_internal_fit_conflict: escalate_to_research_lead
    integration_owner: research_lead

What matters here is that the purpose of the branches is not only to “collect material” but also to create the contrast needed for later synthesis. The join is meaningful precisely because internal and external, and prototype and governance, are in tension.
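
The readiness rule `all_returns_present_or_explicit_insufficient` can be sketched as a check over branch returns. This is an illustrative reading and not a defined PCE 2.0 API: the `INSUFFICIENT` sentinel, the dictionary shape of a return, and the function names are all assumptions. A delta counts as present if its key exists, even when its value is the sentinel.

```python
PREFIX = "research.checkout.invariant-verification."
INSUFFICIENT = "explicit_insufficient"  # assumed sentinel for "looked, could not establish"

# Required deltas per branch, copied from the return_contract entries above.
REQUIRED_DELTAS = {
    PREFIX + "internal-evidence": ["internal_failure_patterns", "current_pain_summary", "contradictory_cases_if_any"],
    PREFIX + "external-scan": ["external_technique_summary", "adoption_preconditions", "failure_modes"],
    PREFIX + "prototype": ["prototype_result", "false_positive_note", "engineering_effort_note"],
    PREFIX + "governance": ["governance_impact_note", "rollout_integration_note", "ownership_cost_note"],
}


def missing_deltas(returns: dict[str, dict[str, object]]) -> dict[str, list[str]]:
    """Map each branch to the deltas it neither returned nor marked insufficient."""
    gaps: dict[str, list[str]] = {}
    for branch, deltas in REQUIRED_DELTAS.items():
        got = returns.get(branch, {})
        missing = [d for d in deltas if d not in got]
        if missing:
            gaps[branch] = missing
    return gaps


def join_ready(returns: dict[str, dict[str, object]]) -> bool:
    """all_of join: ready only when no required delta is silently absent."""
    return not missing_deltas(returns)
```

The design choice is that an explicit "insufficient" is acceptable at join time while a silent omission is not, so gaps stay visible to the integration owner.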

Step 4A: internal evidence branch

This branch digs into past internal process.

source

bug triage records
PR review notes
rollback notes
failure memory
incident records
regression history

Example return values

There were 4 cases of the negative-total / rounding / precedence class in the past 3 months.
2 cases were not caught by PR review alone.
However, the defect classes are not fully isomorphic.
current pain is real, but target class is narrower than “all pricing bugs”

Key point

This branch should return not only “reasons to adopt” but also counterevidence such as “the target class may actually be so narrow that it does not justify the adoption cost”.

An internal evidence branch without this becomes a corruption source for the research loop.

Step 4B: external scan branch

This branch looks at external docs, papers, tooling, and precedents.

Example return values

There are precedents where invariant/property-based verification was effective in pricing-like systems.
However, domain-specific invariant design cost is high.
It tends to be abandoned when false positive control and ownership design are weak.
“works in demos” and “stays useful in review flow” are separate problems.

Key point

External scan carries no authority. An external “best practice” does not confer adoption authority for the current organization. In PCE 2.0 terms, it is evidence but not a decision.

Step 4C: bounded prototype branch

This branch runs a small prototype. This is where research most easily becomes seductive.

Scope of the prototype

No full production rollout.
Target defects are narrow.
Apply it to only one small rule set.
Always record false positives.
Leave a maintainability note.

Example return values

The prototype detected 3/4 of the target class.
Even on one known good case, 2 false positives occurred.
rule authoring cost is non-trivial
prototype success is promising but not adoption-ready

Key point

Prototype success is one of the largest sources of corrupt success in research. “It worked” must not be read as “it is fine to adopt”. The prototype branch should always return the following.

detection signal
failure signal
cost signal
boundedness note
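
The four required signals can be carried in one typed return so that a detection number never travels alone. The sketch below uses the example figures from this section (3 of 4 targets detected, 2 false positives). The `PrototypeReturn` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class PrototypeReturn:
    detected: int          # detection signal: target defects the prototype caught
    targets: int           # size of the target defect class under test
    false_positives: int   # failure signal: alerts raised on known-good cases
    cost_note: str         # cost signal, e.g. rule authoring effort
    boundedness_note: str  # what the prototype did NOT cover

    def detection_rate(self) -> Fraction:
        """Exact detection rate over the target class."""
        return Fraction(self.detected, self.targets)

    def is_complete(self) -> bool:
        """A return is complete only when cost and boundedness are stated."""
        return bool(self.cost_note.strip() and self.boundedness_note.strip())


example = PrototypeReturn(
    detected=3,
    targets=4,
    false_positives=2,
    cost_note="rule authoring cost is non-trivial",
    boundedness_note="applied to one small rule set only",
)
```

Treating a return without a cost or boundedness note as incomplete is one way to stop “it worked” from being read as adoption readiness.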

Step 4D: governance / adoption cost branch

This branch looks less at the technology itself and more at whether it can be carried “as a process”.

What to look at

integration cost into PR review
CI latency
ownership: who maintains the invariants
rollback strategy
false positives when incident pressure is high
memory implications: how to update checklists and failure memory

Example return values

Full adoption raises the current review burden too much.
Integration is feasible if limited to financial correctness invariants.
The invariant catalog should be owned by the pricing-domain owner, and runner maintenance by the developer productivity owner.
Rollback should be arranged so that removing the required check and disabling the pilot rule set can be done in one step.
The false positive threshold during the pilot is capped at “no false blocks on more than 5% of pricing PRs” and “no increase of more than 10 minutes in median review overhead”.
Pilot-first is the safe rollout.

This branch matters because it pulls the research result back from “is it technically possible” to “does it hold up as a process”.
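
The two pilot caps above (at most 5% of pricing PRs falsely blocked, at most 10 minutes of added median review overhead) can be checked mechanically. This is a minimal sketch. The function name, its parameters, and the choice to measure overhead as a per-PR list of added minutes are assumptions.

```python
from statistics import median

MAX_FALSE_BLOCK_PERCENT = 5   # cap from the governance return above
MAX_MEDIAN_OVERHEAD_MIN = 10  # cap from the governance return above


def pilot_within_budget(pricing_prs: int, false_blocks: int, overhead_min: list[float]) -> bool:
    """Check both pilot caps; overhead_min holds added review minutes per PR (assumed)."""
    if pricing_prs <= 0 or not overhead_min:
        raise ValueError("need at least one pricing PR and one overhead sample")
    # Integer comparison avoids floating-point edge cases exactly at the 5% boundary.
    rate_ok = false_blocks * 100 <= MAX_FALSE_BLOCK_PERCENT * pricing_prs
    overhead_ok = median(overhead_min) <= MAX_MEDIAN_OVERHEAD_MIN
    return rate_ok and overhead_ok
```

For example, 2 false blocks in 40 pricing PRs sits exactly on the 5% cap and passes, while 3 in 40 does not.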

Step 5: first join

Once the four branches have returned, research_lead joins them.

What the join reveals

internal pain is real
external precedent is encouraging
prototype is promising but noisy
full adoption cost is high
limited pilot on financial invariants looks plausible

What happens at this point

It becomes clear that the initial question, “should we adopt it?”, is too coarse as it stands.

More precisely, it is better to reframe it as:

not whether full adoption is viable, but whether we should run a pilot limited to financial correctness invariants

Here the research loop does not end in a single pass. The first join often returns question narrowing rather than an answer.

This is a key point of the research loop.

Step 6: reframe and second loop

research_lead narrows the question.

old question

Should we adopt invariant-based verification?

new question

Should we try, in a bounded way in the next checkout release, a pilot limited to financial correctness invariants?

What changes

The hypothesis set becomes narrower.
The importance of the governance branch rises further.
The prototype branch becomes a pilot design branch rather than an adoption proof.
The recommendation target changes from adopt/reject to pilot/defer.

What matters here is that changing the question in the middle of a research loop is not a failure. However, it must not be silent drift; it should be done as an explicit reframe.

Step 7: recommendation package

At the end of the second loop, research_lead builds the recommendation package.

Shape of the recommendation

In this case it looks, for example, as follows.

recommended next step

Full adoption is not recommended at this point. However, a bounded pilot limited to financial correctness invariants is recommended, under the following conditions.

target defect class is explicitly limited

reviewer burden budget is measured

false positive threshold is capped at <= 5% false blocks and <= 10 min median review overhead increase

invariant catalog ownership is held by pricing-domain owner, runner maintenance by developer productivity owner

rollback path for disabling pilot is a one-step operation at the required-check / rule-set level

pilot outputs are kept provisional until post-pilot evaluation

This package should include at least the following.

supporting evidence
counterevidence
open risks
why not full adoption
why pilot is still admissible
what would invalidate this recommendation
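
The six required sections can be enforced with a simple completeness check before the package reaches the acceptance gate. The sketch below assumes the package is a mapping from section name to text. That representation and the `missing_sections` helper are hypothetical.

```python
# Section names copied from the list above.
REQUIRED_SECTIONS = (
    "supporting evidence",
    "counterevidence",
    "open risks",
    "why not full adoption",
    "why pilot is still admissible",
    "what would invalidate this recommendation",
)


def missing_sections(package: dict[str, str]) -> list[str]:
    """Return required sections that are absent or contain only whitespace."""
    return [s for s in REQUIRED_SECTIONS if not package.get(s, "").strip()]
```

A package with an empty counterevidence section is reported as incomplete rather than silently accepted.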

pilot design checklist

Is the pilot scope limited to financial correctness invariants?
Are alerts handled as bounded enforcement rather than hard fail?
Are the false positive threshold and review burden budget fixed in advance?
Are the invariant catalog owner and the runner maintenance owner assigned separately and explicitly?
Can it be rolled back immediately by removing the required check or disabling the pilot rule set?
Are pilot outputs kept provisional until post-pilot evaluation, rather than written directly into canonical decision memory?

What matters here is that a research recommendation is not a one-way sales memo but an admissibility package.

Corrupt Success in research

The most dangerous thing in a research loop is ending up with a “very clever-looking recommendation”.

Typical corrupt success in this case

The prototype branch produces a flashy demo, the external scan finds several precedents, and the research memo looks very convincing.

In reality, however,

the internal defect class is narrow
the governance cost branch rejects full adoption
the false positive threshold is not fixed
maintenance ownership is not fixed
the rollout / rollback path is not a one-step disable

and yet the research concludes:

“invariant verification was effective. We should adopt it.”

As a surface outcome this is beautiful. As a process it is broken. What has happened is:

counterevidence suppression
prototype inflation
question drift
governance omission
adoption readiness overclaim

In PCE 2.0, this is regarded as research-corrupt success.

Response

invalidate the recommendation package
return the counterevidence to active
reopen the recommendation acceptance gate
if needed, reframe or rerun governance branch

In research, a “well-organized report” easily looks like success. That is exactly why the concept of Corrupt Success is especially needed.

Checkpoint and recovery in research

Research is long-running, so checkpoints between loops are important.

Boundaries to checkpoint

question freeze completed
hypothesis set fixed
first join completed
narrowed question fixed after reframe
recommendation package ready

What a recovery point needs

current question
current hypothesis set
branch return refs
unresolved conflicts
stale branch markers
next required loop step
acceptance gate status

Cautions when recovering

Stale evidence is especially dangerous in research. For example, if

the external scan has gone out of date
an internal architecture assumption has changed
release constraints have changed
the candidate pilot scope has been absorbed into another frame

then the old synthesis cannot be used as current truth.

In other words, research recovery is not merely “continuing from the previous notes”; it requires rebinding question / evidence / assumptions.
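
Rebinding on recovery can be sketched as a staleness check over stored branch returns. The data shapes below are assumptions for illustration. Each return records the question version it answered and the assumptions that held when it was collected, and a return is stale if either no longer matches the current state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchReturn:
    branch: str
    question_version: int        # version of the frozen question this evidence answered
    assumptions: frozenset[str]  # assumptions that held when the evidence was collected


@dataclass(frozen=True)
class RecoveryPoint:
    question_version: int        # current (possibly reframed) question version
    returns: tuple[BranchReturn, ...]


def stale_returns(point: RecoveryPoint, current_assumptions: set[str]) -> list[str]:
    """Name the branches whose evidence must be revalidated before reuse."""
    stale = []
    for r in point.returns:
        wrong_question = r.question_version != point.question_version
        broken_assumption = not r.assumptions <= current_assumptions
        if wrong_question or broken_assumption:
            stale.append(r.branch)
    return stale
```

This mirrors the resume conditions `invalidate_stale_branch_returns` and `revalidate_current_architecture_assumptions` in the frame, under the stated assumptions about how versions and assumptions are recorded.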

The output of research is a Process Delta

The output of a research loop need not be an artifact. Much of it is better understood as a Process Delta.

Representative deltas

1. hypothesis delta

definitions of H1/H2/H3/H4
narrowed question after reframe

2. evidence delta

internal recurrence note
prototype result
external precedent summary
governance burden summary

3. synthesis delta

integrated recommendation candidate
uncertainty register
decision dependency map

4. failure / rejection delta

rejected hypothesis
why full adoption is not admissible yet
prototype overclaim warning

5. operational memory candidate

research review checklist
pilot design checklist
counterevidence register template

6. decision delta

recommend pilot, not full adoption

In this way research has typed outputs that can withstand later eval / approval / memory promotion, rather than being “soft discussion before reaching a conclusion”.

Durable learning from research

In this case, at least the following memory candidates can arise.

1. failure memory

flashy prototype success should not be treated as adoption readiness without governance fit

This is valuable as an anti-pattern of the research process.

2. operational memory candidate

adoption research for high-risk correctness tooling should branch into: internal evidence / external precedent / bounded prototype / governance cost

This can become a reusable research playbook candidate.

3. decision memory

full invariant-verification adoption is not yet justified; pilot-first is the accepted path for this domain

However, making this canonical decision memory requires human-backed acceptance.

4. evaluation memory

prototype success alone is insufficient; research recommendation requires counterevidence section and adoption cost section

This helps improve the eval contract of future research loops.

What matters here too is not to turn every research note into memory, but to extract only the bounded knowledge that benefits future process.

Process metrics to watch in this case

In a research loop, process metrics matter more than artifact metrics. At least the following are useful.

Hypothesis Invalidation Yield

The proportion of stated hypotheses that were clearly rejected or revised. Higher is not necessarily better, but zero may indicate confirmation-biased research.

Evidence Sufficiency Rate

The proportion of recommendation packages that had sufficient evidence and counterevidence at first submission.

Branch Return Completeness

How fully the return contracts of the internal / external / prototype / governance branches were satisfied.

Time to First Discriminating Evidence

The time until evidence appears that discriminates between hypotheses, as opposed to “plausible-looking information”.

Reframe Rate

How many times the question was reframed after question freeze. Too high may mean the framing is coarse, but zero can also be unnatural in open-ended research.

Corrupt Success Rate

The proportion of plausible recommendations that were later invalidated / reopened.

Research Memory Promotion Precision

The proportion of research-derived operational memory and decision memory that was actually used in subsequent research / adoption decisions.
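
As one worked example of these metrics, Hypothesis Invalidation Yield can be computed from per-hypothesis outcomes. The status vocabulary (`rejected`, `revised`, `supported`) and the sample outcomes below are hypothetical and do not report results of this case.

```python
from fractions import Fraction

INVALIDATED = {"rejected", "revised"}  # assumed status vocabulary


def hypothesis_invalidation_yield(statuses: dict[str, str]) -> Fraction:
    """Share of stated hypotheses that were clearly rejected or revised."""
    if not statuses:
        raise ValueError("at least one hypothesis is required")
    changed = sum(1 for s in statuses.values() if s in INVALIDATED)
    return Fraction(changed, len(statuses))


def looks_confirmation_biased(statuses: dict[str, str]) -> bool:
    """A yield of zero is a warning sign, not proof, of confirmation bias."""
    return hypothesis_invalidation_yield(statuses) == 0


# Hypothetical outcomes for illustration only.
sample = {"H1": "revised", "H2": "rejected", "H3": "supported", "H4": "supported"}
```

Using an exact fraction keeps the metric comparable across loops with different numbers of hypotheses.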

What this case shows

The claims of PCE 2.0 visible through this case are at least the following.

1. Research is not a vague period before implementation

It is itself a bounded process frame.

2. Success in research does not mean “adopting”

“Do not adopt”, “stop at a pilot”, and “defer for now” are also fully valid successful outcomes.

3. Hypotheses are output, and so are refutations

Research is not only a search for the right answer but also the work of cutting off wrong paths.

4. Branch and join is essential to research

Unless internal / external / prototype / governance are run in parallel, the adoption decision tends to be biased.

5. A compelling narrative is dangerous

A plausible synthesis can quickly become Corrupt Success. Counterevidence retention is essential.

6. Research also produces durable learning

But provisional and canonical must be separated, and write authority must be made explicit.

7. Human oversight remains at the end

Even if AI and tools can carry much of research execution, the final gate for adoption / pilot / risk acceptance is often held by human-backed authority.

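Final summary of this case

Summarized in one sentence, this research loop case is as follows.

In PCE 2.0, research is not a matter of vaguely looking things up after receiving a question, but a process frame that freezes the question in a bounded form, makes competing hypotheses explicit, runs internal / external / prototype / governance branches, integrates both refutation and support through join, reframes the question itself when necessary, and decides which of adopt / pilot / defer / reject is admissible under the current evidence.

In this sense the research loop is not merely an investigation phase. In PCE 2.0 it is an exploration topology that safely reduces uncertainty and makes the next decision possible in a governed way.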