If the PDF is messy, let your AI clean it before you bring it into SynapQ

Use this flow for scanned, OCR-heavy, or visually noisy files. Give the PDF and the SynapQ prompt to your AI, ask for clean Markdown or text, then upload or paste the result back into SynapQ.

Plain-text mirror for crawlers and browser-enabled models: https://www.synapq.app/docs/ai-prep.txt

Hard PDF flow

Step-by-step: clean the file with AI first

This is the full path. If the file is messy, do these four steps and then bring the cleaned result back into SynapQ.

Flow: PDF → LLM + SynapQ prompt → Markdown / Text → Upload back

1. Give the PDF to your LLM

Open the PDF in Claude, ChatGPT, Gemini, or another strong multimodal model.

2. Paste the SynapQ prompt

Use the SynapQ prompt so the model rewrites the file into parser-friendly Markdown.

3. Ask for clean Markdown or text

Tell the model to return only the cleaned study sheet, not commentary or JSON.

4. Bring it back into SynapQ

Upload the `.md` or `.txt` file, or paste the result directly into the New Material flow.

End result: clean Markdown or plain text that SynapQ can parse without guessing.

FAQ

The shortest answers to the questions users usually ask before trying this flow.

When should I take the file to an AI model first?

Use the AI-prep flow when the PDF is scanned, visually noisy, OCR-heavy, or when SynapQ is likely to miss questions or answer keys on the first pass.

  • Scanned PDFs with broken OCR, double columns, and noisy headers.
  • Image-heavy sources where the normal parse path would need visual enrichment.
  • Legacy materials that are readable to a strong LLM but expensive to normalize on SynapQ servers.

What exactly should I ask the model to return?

Ask for clean Markdown or plain text that follows SynapQ's question contract. The model should return the study sheet itself, not commentary or JSON.

  • Start each item with `## Question N`.
  • Keep choices on separate `A. / B. / C.` lines.
  • Include `Visual dependency: yes|no`.
  • Only add `Answer: X` when the answer is actually recoverable from the PDF.
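The contract above is regular enough to check mechanically before you upload. As an illustration only (SynapQ's real parser is not public, and `split_questions` is a hypothetical helper), a minimal Python sketch that splits cleaned Markdown into question records:

```python
import re

def split_questions(markdown: str) -> list[dict]:
    """Split SynapQ-style cleaned Markdown into question dicts.

    Sketch only: assumes the contract described above, i.e.
    '## Question N' headings, 'A. / B. / C.' choice lines,
    and an optional 'Answer: X' metadata line.
    """
    # re.split with a capturing group yields [prefix, num, body, num, body, ...]
    blocks = re.split(r"(?m)^## Question (\d+)\s*$", markdown)
    questions = []
    for num, body in zip(blocks[1::2], blocks[2::2]):
        choices = re.findall(r"(?m)^([A-E])\.\s+(.*)$", body)
        answer = re.search(r"(?m)^Answer:\s*([A-E])\s*$", body)
        questions.append({
            "number": int(num),
            "choices": dict(choices),
            "answer": answer.group(1) if answer else None,
        })
    return questions
```

A quick skim of the returned list makes dropped choices or missing answers visible at a glance.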

What do I upload back into SynapQ?

Bring back the cleaned output as pasted text, `.md`, or `.txt`. SynapQ accepts the same parser-friendly structure in all three forms.

  • Review the output quickly for dropped questions or malformed choices.
  • Keep any `Extracted Images` appendix intact if the model generated one.
  • If the output is long, prefer downloading a single `.md` file instead of copying a huge chat response.
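The "review for dropped questions" step can also be automated. A hedged sketch, assuming the `## Question N` headings survived cleaning (`find_numbering_gaps` is an illustrative helper, not part of SynapQ):

```python
import re

def find_numbering_gaps(markdown: str) -> list[int]:
    """Return question numbers missing from an otherwise contiguous
    1..max sequence of '## Question N' headings. Sketch only; a gap
    usually means the model silently dropped a question."""
    nums = [int(n) for n in re.findall(r"(?m)^## Question (\d+)\s*$", markdown)]
    if not nums:
        return []
    return sorted(set(range(1, max(nums) + 1)) - set(nums))
```

An empty result does not prove the output is complete (the source may skip numbers too), but a non-empty result is a strong signal to re-run that part of the PDF.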

Can I use NotebookLM links directly?

Not right now. Treat NotebookLM as a place to generate content, then bring the output back into SynapQ by copy/paste or file export.

  • SynapQ does not import NotebookLM links directly.
  • Copy the generated quiz back into SynapQ or export it to `.md` / `.txt` first.
  • If the output is still messy, run it through the SynapQ AI-prep prompt before uploading.

What if the PDF only shows the correct answer text and no options?

That is a narrow special case. The model may synthesize distractors only for clear single-best-answer questions, and those items should come back marked for review.

  • Keep the stem unchanged.
  • Keep the original correct answer text unchanged.
  • Add exactly 3 plausible distractors and mark `Needs review: yes`.

What about images, tables, and visually marked answers?

The model should inspect the PDF visually, not just run OCR. If a question depends on an image or a visually marked correct choice, the output should preserve that fact instead of guessing.

  • Treat the PDF as a visual document, not only text OCR.
  • Prefer answer-key pages when they conflict with inline visual markings.
  • If the image is essential but unclear, mark `Visual dependency: yes` and `Needs review: yes`.


Prompt to paste into your LLM

This block is intentionally standalone. If the model cannot browse, it still has the exact formatting contract needed for SynapQ's fast parser.

A PDF document will be attached. Convert it into clean, parser-friendly Markdown for SynapQ.

If you can browse the web, first read these detailed instructions:
https://www.synapq.app/docs/ai-prep

If you cannot access that page, continue using the rules below without stopping.

Goal:
- Preserve the original question order.
- Preserve exact medical, scientific, and exam terminology.
- Produce output that can be uploaded into SynapQ as Markdown, plain text, pasted text, or a converted PDF.

Answer recovery:
- Treat the PDF as a visual document, not as text-only OCR. Use multimodal inspection when available.
- Inspect the entire PDF for answer signals, not just the question pages.
- There may be no separate answer key. In some booklets, the correct option is marked directly on the question page.
- Some past-exam recall sheets ("çıkmış" sheets in Turkish sources) contain only the correct answer text under the question, with no distractor options.
- If a separate answer key exists, it may appear at the end of the document. Match those entries back to the correct question numbers and add: Answer: X
- If the correct answer is visually marked in the question itself, recover it when possible. Use cues such as bold text, different text color, underline, highlight, check marks, filled circles, or other explicit formatting differences.
- Markings can distort OCR. If a choice label or word looks corrupted because of a mark, stamp, or highlight, interpret that anomaly as a possible answer signal instead of discarding it immediately.
- If both an inline visual cue and a final answer key exist, prefer the explicit answer key when they conflict.
- If the PDF clearly gives only one correct answer for a single-best-answer question and no alternative options, keep the stem unchanged, keep that original correct answer text unchanged, and synthesize exactly 3 plausible distractors from the stem context.
- Only synthesize distractors for standard single-best-answer questions. Do not synthesize distractors for matching, ordering, true/false, multi-statement, multi-answer, or open-ended questions.
- Only add Answer: X when the answer is recoverable from the PDF with reasonable confidence. If the answer is unclear, omit Answer: X and add Needs review: yes.

Output format:
- Output only Markdown. Do not include commentary, JSON, XML, HTML, or code fences.
- If your interface can actually generate a downloadable file and the result would be long, prefer returning a single .md file containing only the final cleaned document.
- Treat 40 or more questions, or any clearly long output, as a strong reason to prefer a file artifact when the interface supports it.
- If the interface does not actually produce downloadable files, return only the raw final Markdown document in the chat response without explaining that limitation.
- Keep exactly the same SynapQ format in either case. Switching from chat output to a .md or .txt file must not change the `## Question N` / choices / metadata structure.
- Do not add any preface, summary, explanation, confidence note, or closing text before or after the Markdown.
- Start each question with: ## Question N
- Write the full question stem directly below the heading.
- Then write: Visual dependency: yes or Visual dependency: no
- Put each choice on its own line as:
  A. ...
  B. ...
  C. ...
  D. ...
  E. ...
- Preserve the real number of choices from the source unless the source gives only one correct-answer text for a single-best-answer question. In that special case, output exactly 4 choices total: the original correct answer plus 3 synthesized distractors.
- When distractors are synthesized, do not rewrite the stem and do not rewrite the original correct answer text. Only add the 3 new distractors around it.
- If an answer key is present, add: Answer: X
- If an explanation is present, add: Explanation: ...
- If distractors are synthesized, add: Needs review: yes
- If any important part is unreadable, ambiguous, truncated, or visually dependent in a way that is not recoverable, add: Needs review: yes
- If you can faithfully extract one question-specific image, do not place it inline inside the question block. Append a final section exactly titled:
  ## Extracted Images
- If you are returning chat output, prefer precise crop coordinates instead of base64 in exactly this format:
  Question N: page=12 x0=120 y0=340 x1=820 y1=1260
- If you are returning a downloadable file artifact and can faithfully include the image bytes, you may use:
  Question N: data:image/png;base64,...
- Every image entry must stay on one physical line.
- Do not use bold, bullets, citations, labels in parentheses, or commentary in the image appendix.
- Do not put the data URL or coordinates on the next line.
- Do not write anything after the last image line.
- Use at most one extracted image per question, and only when it clearly belongs to that specific question number.
- If the image cannot be extracted faithfully, skip the appendix entry and rely on Visual dependency: yes / Needs review: yes instead.

Cleanup rules:
- Remove page numbers, headers, footers, watermarks, repeated titles, and duplicated scan noise.
- Reconstruct broken line wraps so the question reads naturally.
- Keep separate questions separate. Do not merge or split them incorrectly.
- Do not translate the content.
- Do not simplify terminology.
- Do not guess missing words, figures, tables, or answer choices, except for the explicit single-answer-only distractor synthesis rule above.
- If a table or figure is essential and readable, summarize only the minimum needed to understand the question.
- If a figure, table, image, or scan region is essential but unclear, mark Visual dependency: yes and Needs review: yes.

Return the final Markdown only.

Important: tell the model to recover answers from the PDF itself

For answer-bearing PDFs, the model should not stop at question text extraction. It should inspect final answer-key pages, inline formatting cues, and explicit visual markings that indicate the correct option, then emit `Answer: X` only when that mapping is actually recoverable from the source.

Also tell the model that some PDFs have no separate answer key at all, and that the correct option may only be visible through highlighting, bold or darker text, circles, ticks, or OCR-looking distortions caused by those markings.

If the cleaned output becomes long, the model should prefer returning a single `.md` file only if the interface can actually generate a downloadable file. If it cannot, SynapQ still accepts the same prepared-text contract from pasted text, uploaded `.md`, and uploaded `.txt` files.

If the model can faithfully extract a question image, it should append those visuals in a final `Extracted Images` appendix keyed by question number instead of placing raw image payloads inside the question body.

More detailed rules and examples

Target output shape

  • If the interface can actually generate downloadable files, return a single `.md` file containing only the cleaned document when the output is long.
  • Treat 40 or more questions, or any obviously long output, as a reason to prefer a `.md` file over a chat message when the interface supports it.
  • If the interface does not actually produce files, return only the raw Markdown document with no preface or trailing explanation.
  • Use the exact same question contract in chat output and in `.md` / `.txt` files so SynapQ can ingest either form.
  • One question at a time, in original order.
  • Each question starts with `## Question N`.
  • Question stem appears as plain text immediately below the heading.
  • Choices stay on separate lines in `A. ...` format.
  • If the source gives only one correct answer text for a standard single-best-answer question, output exactly 4 choices total by adding 3 synthesized distractors around the untouched correct answer.
  • `Visual dependency: yes|no` is always included.
  • Optional metadata is limited to `Answer: X`, `Explanation: ...`, and `Needs review: yes`.
  • If the model can faithfully extract a question image, append it only once in a final section exactly titled `## Extracted Images`.
  • For chat output, prefer a single-line coordinate entry in exactly this format: `Question N: page=12 x0=120 y0=340 x1=820 y1=1260`.
  • For downloadable file artifacts, `Question N: data:image/png;base64,...` is also allowed on one physical line.
  • Do not use bold, bullets, citations, labels in parentheses, or commentary in the image appendix.
  • Do not move the coordinates or data URL onto the next line and do not write anything after the last image line.
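The shape rules above can be linted per question block before upload. This is a sketch under the stated contract, not SynapQ's actual validation (`lint_question_block` is a hypothetical name):

```python
import re

REQUIRED_VISUAL = re.compile(r"(?m)^Visual dependency: (yes|no)\s*$")

def lint_question_block(block: str) -> list[str]:
    """Return human-readable problems with one cleaned question block.

    Checks only the contract described above: the '## Question N'
    heading, the mandatory 'Visual dependency' line, and the presence
    of at least two 'A. / B. / ...' choice lines.
    """
    problems = []
    if not re.match(r"## Question \d+", block):
        problems.append("missing '## Question N' heading")
    if not REQUIRED_VISUAL.search(block):
        problems.append("missing 'Visual dependency: yes|no' line")
    if len(re.findall(r"(?m)^[A-E]\.\s", block)) < 2:
        problems.append("fewer than two choice lines")
    return problems
```

Running this over each block caught by a `## Question` split gives a cheap pre-upload sanity check; anything it flags is worth fixing in the chat before bringing the file into SynapQ.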

What the model should clean up

  • Remove page numbers, scan headers, watermarks, repeated section titles, and duplicated OCR fragments.
  • Rejoin broken line wraps so each question reads like normal prose.
  • Keep exact terminology for diagnoses, drugs, anatomy, and scientific notation.
  • Compress readable tables or figure labels into the minimum text needed for the question.
  • If a figure or table is essential but unclear, mark the question for review instead of guessing.

What the model must not do

  • Do not invent missing options, answer keys, labels, or image content outside the explicit single-answer-only distractor rule.
  • Do not merge adjacent questions that only look close together on the scan.
  • Do not translate, simplify, paraphrase, or modernize the wording.
  • Do not add any introductory note, summary, or closing comment around the Markdown output.
  • Do not wrap the result in JSON, XML, HTML, or markdown code fences.

How the model should recover answers

  • Treat the PDF as a visual document, not as text-only OCR. Use multimodal inspection when the model supports it.
  • Inspect the whole PDF for answer signals, including separate answer-key pages at the end.
  • Do not assume a separate answer key exists. Some PDFs mark the correct choice directly on the question page.
  • Some past-exam ('çıkmış') recall sheets give only the correct answer text with no distractors. Detect that layout explicitly instead of assuming the options were merely lost.
  • Map final answer-key entries back to the correct question numbers and emit `Answer: X` when the mapping is clear.
  • Use explicit visual cues inside the question such as bold, different color, underline, highlight, check marks, or filled markers when they clearly indicate the correct choice.
  • Treat OCR anomalies caused by marking, stamps, or highlights as possible answer signals instead of ignoring them automatically.
  • If inline formatting conflicts with a separate answer key, prefer the explicit answer key.
  • If the PDF clearly supplies only one correct answer for a standard single-best-answer question, keep that answer text unchanged and synthesize exactly 3 plausible distractors from the stem context.
  • Never synthesize distractors for matching, ordering, true/false, multi-statement, multi-answer, or open-ended questions.
  • If the answer is still uncertain, omit `Answer: X` and mark `Needs review: yes` instead of guessing.

Single-answer recall sheets

  • Use this only when the PDF clearly shows a single-best-answer question with one answer text and no existing distractor options.
  • Do not rewrite the question stem.
  • Do not rewrite the original correct answer text from the PDF.
  • Add exactly 3 plausible distractors so the final question has 4 options total.
  • Keep the distractors medically plausible but clearly not better than the original correct answer.
  • Always add `Answer: X` for the preserved correct choice and `Needs review: yes` when distractors were synthesized.
  • Do not use this for matching, ordering, Roman numeral combination, true/false, multi-answer, or open-ended items.

Visual and low-confidence questions

Use `Visual dependency: yes` when:

  • The question depends on an image, chart, pathology slide, ECG, radiology figure, or table.
  • The stem is only understandable with labels or annotations from the source image.
  • The model can summarize the visible context but should not pretend the visual has been fully converted into text.

Use `Needs review: yes` when:

  • A choice is clipped, garbled, or missing.
  • A figure or table is required but unreadable.
  • The scan is too degraded to preserve the original meaning with confidence.

Extracted image appendix

  • Only append extracted images when the model can faithfully recover a question-specific image from the PDF.
  • Do not place image data or crop coordinates inside the main question block. Keep them in a final section exactly titled `## Extracted Images` after the last question.
  • For chat output, prefer one line per image in exactly this format: `Question N: page=12 x0=120 y0=340 x1=820 y1=1260`.
  • For downloadable file artifacts, one line per image in `Question N: data:image/png;base64,...` format is also allowed.
  • Do not use bold, bullets, citations, labels in parentheses, or commentary in that appendix.
  • Do not move the coordinates or data URL onto the next line and do not write anything after the last image line.
  • Use at most one extracted image per question in this exact format.
  • If the image is ambiguous, low-quality, or not clearly tied to a specific question number, skip the appendix entry and mark the question for review instead.
Appendix example

## Question 3
The ECG shown is most consistent with which rhythm?
Visual dependency: yes
A. Atrial fibrillation
B. Ventricular tachycardia
C. Supraventricular tachycardia
D. Sinus tachycardia
Needs review: yes

## Extracted Images
Question 3: page=4 x0=120 y0=340 x1=820 y1=1260

Good output example

## Question 1
A 62-year-old patient presents with sudden painless vision loss in the right eye. Which diagnosis is most likely?
Visual dependency: no
A. Central retinal artery occlusion
B. Optic neuritis
C. Acute angle-closure glaucoma
D. Retinal migraine
Answer: A

## Question 2
The lesion shown in the fundus image is most consistent with which condition?
Visual dependency: yes
A. Diabetic retinopathy
B. Choroidal melanoma
C. Retinal detachment
D. Hypertensive retinopathy
Needs review: yes

Synthesized distractor example

The stem below is a Turkish past-exam item, deliberately left untranslated to match the prompt's no-translation rule; it asks which of the listed banned substances or methods is not considered doping when used out of competition.

## Question 35
Aşağıda yazılan yasaklı madde ve yöntemlerden hangisi müsabaka dışı dönemlerde kullanıldığı zaman doping olarak kabul edilmez?
Visual dependency: no
A. Beta-2 agonistler
B. Narkotik analjezikler
C. Eritropoetin
D. Anabolik ajanlar
Answer: B
Needs review: yes

Bad output example

Question 1) Sudden vision loss question
A) CRAO
B) ON
C) glaucoma
D) migraine

Question 2 uses an image and is probably diabetic retinopathy.

This fails because it drops the required question heading format, skips the visual flag, shortens medical terms, and invents a probable answer for the second question.

Analytics cookies

We use Google Analytics with consent mode to understand which pages help and where the product breaks down. Accepting enables full analytics storage. Rejecting keeps analytics storage denied and limits measurement. Privacy Policy