ChatGPT vs Qwen for Research
For active research, ChatGPT's web search capability is a critical advantage that justifies its cost premium—researchers need access to current papers, findings, and real-time data. Qwen offers exceptional cost-effectiveness and competitive reasoning benchmarks for analyzing large documents and papers, but its lack of web browsing severely limits its usefulness for staying current with the latest research developments.
Head-to-Head for Research
| Criteria | ChatGPT | Qwen | Winner |
|---|---|---|---|
| Web Search & Source Access | Built-in web search for current sources | No web search capability | ChatGPT |
| Complex Reasoning Tasks | 92.8% on GPQA Diamond benchmark | 88.4% on GPQA Diamond benchmark | ChatGPT |
| Global Research Scale | $2.50-$15.00 per 1M tokens (expensive) | $0.40-$2.40 per 1M tokens (very affordable) | Qwen |
| Multilingual Research Coverage | Good general multilingual support | Exceptional multilingual, especially Chinese | Qwen |
| Long Document Analysis | 272K token context window | 256K token context window | Tie |
See Them Answer a Research Question
We asked both models the same research-focused prompt.
“Synthesize the latest peer-reviewed findings on CRISPR-Cas9 off-target effects in therapeutic applications and identify the key unresolved questions in the field.”
Here is a concise synthesis of the current peer-reviewed literature on CRISPR-Cas9 off-target effects in therapeutic applications, followed by the main unresolved questions.
Executive summary
The field has moved from asking whether off-target editing occurs to asking:
- How often clinically meaningful off-target events occur in real therapeutic settings
- Which detection methods best predict patient risk
- What kinds of genomic damage matter most—small indels, large deletions, chromosomal rearrangements, translocations, vector/genome insertions, or p53-linked selection
- How to balance editing efficiency against safety in different delivery contexts
The latest peer-reviewed findings suggest that:
- Off-target editing can often be reduced to very low levels with careful guide design, high-fidelity Cas9 variants, optimized delivery, and transient nuclease exposure.
- However, rare but potentially consequential events remain a major concern, especially in therapeutic settings where even a low-frequency event may matter if it occurs in long-lived stem cells, T cells, hepatocytes, or regenerating tissues.
- The most important safety issue is no longer just classic off-target indels at sequence-similar loci, but also unexpected on-target and off-target structural variants, including large deletions, inversions, translocations, and integration of exogenous DNA.
- Ex vivo therapies are currently easier to risk-manage than in vivo therapies, because edited cells can be profiled before infusion.
- The field increasingly recognizes that no single assay is sufficient; orthogonal combinations of prediction, cell-based detection, and unbiased genome-wide methods are needed.
1. What the latest literature shows
A. Canonical off-target cutting is real but can often be made rare
Recent peer-reviewed work confirms that SpCas9 can cleave genomic sites with partial homology to the guide RNA, especially when mismatches occur in PAM-distal regions or when chromatin accessibility is favorable. But the practical message from newer therapeutic studies is:
- with modern guide selection pipelines
- with high-fidelity Cas9 enzymes such as SpCas9-HF1, eSpCas9(1.1), HypaCas9, HiFi Cas9, or related engineered variants
- and with RNP or mRNA delivery rather than prolonged expression from viral vectors
…off-target indels are often reduced below standard detection thresholds or into low-frequency ranges that may be acceptable depending on indication.
This has been especially evident in:
- Ex vivo hematopoietic stem/progenitor cell editing
- T-cell engineering
- Some liver-directed in vivo editing programs
Still, “low” is not the same as “zero,” and some studies continue to find guide-dependent off-target sites even with high-fidelity nucleases.
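The sequence-similarity screening described above can be sketched in a few lines. This is a deliberately naive illustration, assuming a fixed 20-nt spacer, an NGG PAM, and a toy genome string; real pipelines (e.g., Cas-OFFinder, CFD/MIT scoring) allow bulges, weight mismatches by position, and consider chromatin context.

```python
# Naive sketch of sequence-similarity off-target screening:
# find NGG-PAM sites within a mismatch budget of the guide.
# Guide and genome below are illustrative, not real loci.

GUIDE = "GACCCCCTCCACCCCGCCTC"  # hypothetical 20-nt spacer

def mismatches(guide: str, site: str) -> int:
    """Count positional mismatches between guide and a 20-nt site."""
    return sum(1 for g, s in zip(guide, site) if g != s)

def scan(genome: str, guide: str, max_mm: int = 3):
    """Yield (position, site, mismatch count) for NGG-PAM sites
    within max_mm mismatches of the guide."""
    n = len(guide)
    for i in range(len(genome) - n - 2):
        site, pam = genome[i:i + n], genome[i + n:i + n + 3]
        if pam[1:] == "GG":  # NGG PAM
            mm = mismatches(guide, site)
            if mm <= max_mm:
                yield i, site, mm

# Toy genome containing a perfect site and a 2-mismatch site.
genome = "TT" + GUIDE + "TGGAA" + "GACCCCCTCCACCCCGAGTC" + "AGGCC"
hits = list(scan(genome, GUIDE))
for pos, site, mm in hits:
    print(pos, site, mm)
# prints:
# 2 GACCCCCTCCACCCCGCCTC 0
# 27 GACCCCCTCCACCCCGAGTC 2
```

The mismatch count alone is exactly the kind of sequence-only score the chromatin discussion below argues is insufficient.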
B. Delivery method strongly shapes off-target risk
One of the clearest conclusions from the literature is that delivery kinetics matter.
Lower-risk patterns
- Cas9 RNP delivery
- Transient mRNA delivery
- Short intracellular nuclease exposure
These generally reduce off-target cutting because the editing window is brief.
Higher-risk patterns
- Persistent expression, especially with some viral delivery systems
- High intracellular nuclease concentration
- Repeated dosing or prolonged exposure
In therapeutic terms:
- Ex vivo RNP editing is generally viewed as safer from an off-target perspective than prolonged in vivo viral expression.
- For AAV-based donor delivery or systems with sustained Cas9 production, concern remains that prolonged nuclease activity may increase cumulative off-target damage.
C. Chromatin state and cell type matter more than sequence alone
Newer studies emphasize that off-target potential is not dictated solely by sequence similarity.
Important determinants include:
- Chromatin accessibility
- DNA repair state
- Cell-cycle status
- Cell type–specific DNA damage responses
- Guide RNA expression level and scaffold design
As a result, a guide that appears safe in immortalized screening cells may behave differently in:
- Primary human HSPCs
- T cells
- Hepatocytes
- Retinal cells
- Muscle tissue
- Neurons
This is one reason regulators and translational groups increasingly favor testing in therapeutically relevant primary cells rather than relying only on in silico prediction or standard cell lines.
D. The field has broadened from “off-target indels” to “genome integrity”
A major shift in the literature is that the most worrisome adverse events may not be classic small off-target insertions/deletions.
Increasingly recognized damage categories
- Large deletions at on-target sites
- Complex local rearrangements
- Chromosomal translocations, especially when multiplex editing is used
- Chromothripsis-like events in rare contexts
- AAV or plasmid fragment integration at cut sites
- Loss of heterozygosity
- Capture of genomic fragments from other loci
- Unexpected repair outcomes beyond simple NHEJ
These events can be rare and technically difficult to detect, but they are highly relevant for therapeutic risk because they may:
- Disrupt tumor suppressor genes
- Activate oncogenes
- Alter genome structure in long-lived cells
- Create clonal growth advantages
This is now one of the central safety concerns in therapeutic genome editing.
E. Double-strand breaks remain the core source of safety concerns
The strongest consensus in the current literature is that double-strand-break-based editing itself is the central risk driver, both at on-target and off-target sites.
This has motivated interest in:
- Base editors
- Prime editors
- Nickase-based approaches
- CRISPR-associated transposase systems
- Other lower-break or break-free editing methods
However, these alternatives do not eliminate safety concerns; they shift them:
- Base editors can cause guide-independent or guide-dependent deamination
- Prime editors may create indels, scaffold-derived insertions, or rare structural changes
- RNA-guided systems may still have sequence-dependent and context-dependent off-target activities
So while alternative editors often reduce double-strand-break-associated translocations and large deletions, they introduce distinct safety profiles.
F. High-fidelity Cas9 variants improve specificity, but tradeoffs remain
Peer-reviewed comparisons generally support that high-fidelity Cas9 variants reduce off-target cutting substantially, often without eliminating on-target activity. But performance is context dependent.
Observed tradeoffs:
- Some high-fidelity variants show reduced activity at difficult targets
- Some guides perform poorly after fidelity-enhancing substitutions
- The “best” nuclease can vary with:
- target sequence
- cell type
- delivery method
- therapeutic editing threshold
The current practical view is that high-fidelity nucleases should often be considered the default starting point for therapeutic development, but empirical testing remains essential.
G. Detection technology has improved, but no assay fully captures clinical risk
Recent peer-reviewed work has refined multiple off-target detection methods, each with strengths and blind spots.
Common methods and what they reveal
- GUIDE-seq: sensitive in cell systems, useful for DSB mapping, but requires efficient tag integration and may not work equally well in primary therapeutic cells
- CIRCLE-seq / CHANGE-seq / SITE-seq: highly sensitive in vitro biochemical profiling; can overcall sites that are not edited in vivo
- DISCOVER-seq: uses DNA repair recruitment markers in living cells; more physiologic, but not universally applicable
- Amplicon sequencing: excellent for validating known candidate sites, but not discovery
- Whole-genome sequencing: useful for large events or clonal analysis, but not sensitive enough alone for low-frequency rare off-target indels
- Long-read sequencing: increasingly important for large deletions, inversions, vector insertions, and complex rearrangements
Consensus from recent studies:
- In vitro methods are sensitive but can overestimate
- In-cell methods are more biologically relevant but may miss rare events
- WGS alone is inadequate for comprehensive off-target assessment
- Combinatorial workflows are now considered best practice
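At its core, amplicon-seq validation reduces to counting edited reads at a candidate site. The sketch below shows only that frequency arithmetic, using read-length difference as a crude indel proxy on made-up reads; real tools such as CRISPResso2 align reads, filter by quality, and model sequencing error.

```python
# Minimal sketch of amplicon-seq indel quantification at one
# candidate off-target site. Reads and reference are toy data;
# classifying by length difference is a crude proxy for indels.

REFERENCE = "ACGTACGTACGTACGTACGT"  # illustrative amplicon

def indel_fraction(reads, reference=REFERENCE):
    """Fraction of reads whose length differs from the reference,
    a rough stand-in for insertion/deletion calls."""
    if not reads:
        return 0.0
    edited = sum(1 for r in reads if len(r) != len(reference))
    return edited / len(reads)

# Toy data: 2 of 8 reads carry a deletion or an insertion.
reads = [REFERENCE] * 6 + ["ACGTACGTACGTACGT", REFERENCE + "AA"]
print(f"indel frequency: {indel_fraction(reads):.1%}")  # → 25.0%
```

Note that the assay's limit of detection, not this arithmetic, is what determines whether a "0%" result is meaningful.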
H. Ex vivo clinical applications look more controllable than in vivo applications
This is one of the clearest translational conclusions.
Ex vivo editing
Examples include:
- HSPCs for hemoglobinopathies
- CAR-T and engineered T-cell therapies
Advantages:
- Cells can be assayed before infusion
- Editing conditions are tightly controlled
- Clonal or bulk genomic integrity assessments are possible
- Damaged products may be discarded
In vivo editing
Examples include:
- Liver
- Eye
- Muscle
- CNS
Challenges:
- Harder to measure real editing outcomes in all edited cells
- Tissue biopsies are limited
- Rare off-target events may evade detection
- Long-term surveillance is more difficult
- Delivery often introduces prolonged or heterogeneous exposure
For this reason, peer-reviewed commentary and translational studies generally conclude that in vivo therapeutic editing still faces a higher evidentiary burden for off-target risk assessment.
2. What clinical and preclinical studies are indicating
Across therapeutic programs, the latest data support several broad conclusions:
- Clinically advanced ex vivo Cas9 therapies have so far shown reassuring short-term safety signals, with no clear evidence that off-target editing has emerged as a dominant clinical toxicity.
- But follow-up durations remain limited relative to the possibility of:
- insertional oncogenesis–like phenomena
- clonal expansion
- delayed malignant transformation
- stem-cell selection effects
Especially in HSPCs, the key question is not just whether off-target edits can be detected, but whether rare edited clones with growth advantages could emerge years later.
Similarly, in T-cell therapies:
- multiplex editing raises concern for translocations
- manufacturing controls have improved
- but product-specific structural genomics still matters
In liver-directed editing:
- hepatocyte turnover and regenerative biology complicate risk projection
- even low-frequency genomic alterations may be amplified under selective pressure
3. Key unresolved questions
1. What frequency of off-target editing is clinically meaningful?
This remains perhaps the most important unanswered question.
A very rare event may be irrelevant in:
- short-lived differentiated cells
- non-expanding tissues
But potentially serious in:
- stem cells
- progenitors
- memory T cells
- regenerating organs
The field still lacks robust, indication-specific thresholds for:
- acceptable off-target indel rates
- acceptable structural variant rates
- acceptable translocation frequencies
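A one-line calculation shows why a "very rare" per-cell frequency can still matter at the product level: multiply the frequency by the cell dose. The rate and dose below are hypothetical round numbers, chosen only to illustrate the scale.

```python
# Why a low per-cell off-target rate can still be clinically
# relevant: scale it by the number of long-lived cells infused.
# Both numbers below are hypothetical, for illustration only.

def expected_edited_cells(off_target_rate: float, cell_dose: int) -> float:
    """Expected number of infused cells carrying the off-target edit."""
    return off_target_rate * cell_dose

rate = 1e-4          # 0.01% per-cell off-target frequency (hypothetical)
dose = 100_000_000   # 1e8 edited HSPCs in a product (hypothetical)

print(expected_edited_cells(rate, dose))  # → 10000.0
```

Ten thousand cells carrying an edit at a tumor suppressor locus is a very different risk proposition from ten thousand cells carrying an edit in an intergenic desert, which is why frequency thresholds alone cannot settle the question.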
2. Which off-target events are most dangerous: small indels or rare structural variants?
Evidence increasingly suggests that rare structural abnormalities may matter more than low-level small indels.
But unresolved questions include:
- How often large deletions or translocations occur in therapeutic settings
- Which of these persist in vivo
- Which can drive oncogenesis or dysfunction
- Whether current release testing adequately captures them
3. How predictive are current preclinical detection assays of patient risk?
A major gap is the relationship between:
- assay-positive sites in vitro
- actual editing in patient tissues
- long-term clinical consequences
The field still does not know:
- which assay or assay combination best predicts real-world risk
- whether some current workflows are overly conservative
- whether others systematically miss dangerous events
4. How should on-target genotoxicity be integrated into “off-target” safety frameworks?
A major conceptual issue is that many serious risks arise at the intended target site, including:
- large deletions
- inversions
- translocations
- loss of regulatory architecture
So the old distinction between “on-target efficacy” and “off-target safety” is no longer adequate. The unresolved question is how to create unified genome integrity standards for therapeutic editing.
5. What is the long-term cancer risk?
This is still unknown.
Open questions:
- Does Cas9 editing increase long-term malignancy risk in edited stem/progenitor compartments?
- Are p53-pathway responses selecting for clones with DNA damage tolerance?
- Can rare rearrangements or integrations create latent oncogenic risk?
- How many years of surveillance are needed?
Current human follow-up is encouraging but still too short to fully answer these questions.
6. How generalizable are safety findings across tissues and diseases?
A guide/nuclease/delivery combination that is safe in one context may not be safe in another.
Unknowns include:
- tissue-specific DNA repair outcomes
- developmental-stage effects
- disease-specific chromatin landscapes
- inflammatory effects on editing specificity
- species-to-species differences in preclinical models
This limits extrapolation from one program to another.
7. How should multiplex editing be made safe enough for broader clinical use?
Multiplex therapeutic editing is attractive for:
- allogeneic cell therapies
- immune evasion engineering
- complex disease correction
But it increases the risk of:
- translocations
- multi-locus rearrangements
- cumulative DNA damage burden
The field still lacks universal standards for acceptable structural variant burdens in multiplex products.
8. Can non-DSB editors truly solve the off-target problem?
Base and prime editing are often presented as safer alternatives, but unresolved issues include:
- low-frequency genome-wide off-target activity
- RNA off-target effects for some deaminase systems
- bystander edits
- large deletions or rearrangements in some contexts
- pegRNA/scaffold-related insertions in prime editing
So the unresolved question is not whether they are “safe,” but which editor is safest for which therapeutic context.
9. What should regulatory-grade off-target assessment look like?
There is no single universally accepted standard.
Still unsettled:
- minimum assay battery for IND-enabling work
- role of long-read sequencing
- use of orthogonal methods
- acceptable limits of detection
- necessity of patient-specific or genotype-specific testing
- release criteria for ex vivo products
This is as much a translational and regulatory science question as a biological one.
10. Can computational prediction become truly reliable?
Machine-learning and biophysical models have improved guide design, but they still struggle to fully capture:
- chromatin context
- repair outcomes
- cell-type dependence
- structural variant propensity
A major unresolved challenge is building predictors that are accurate enough to reduce experimental burden without missing rare high-consequence events.
4. Current emerging consensus
If one synthesizes the latest literature, the field appears to be converging on several points:
- Cas9 off-target effects are manageable but not solved.
- The biggest risks are broader than simple off-target indels.
- Transient delivery and high-fidelity nucleases are now central safety strategies.
- Ex vivo editing is presently more mature and controllable than in vivo editing.
- Structural genomic damage and long-term clonal consequences are the critical frontier.
- No single detection assay is sufficient; orthogonal profiling is essential.
- Therapeutic safety must be evaluated in the relevant cell type and delivery context.
5. Bottom line
The latest peer-reviewed evidence suggests that CRISPR-Cas9 therapeutic editing has become substantially safer through better guide design, improved Cas9 variants, and transient delivery methods. Yet the field’s main unresolved problems are now concentrated in rare, hard-to-detect, potentially high-consequence genomic events and in the uncertain long-term clinical significance of those events.
In short:
- Routine off-target cutting can often be minimized
- Genome integrity risk remains incompletely characterized
- Long-term human safety data are still the key missing piece
If you want, I can also provide this in one of three more specific formats:
- A literature-review style summary with recent citations
- A table comparing off-target detection methods and their limitations
- A therapeutic-context breakdown: ex vivo HSCs, T cells, liver, eye, muscle, CNS.
Try research tasks with both models
See ChatGPT and Qwen answer side by side in Multichat
Detailed Breakdown
When it comes to research tasks, ChatGPT and Qwen take meaningfully different approaches — and the right choice depends heavily on what kind of research you're doing.
ChatGPT's biggest advantage for researchers is its live web search capability. When you need up-to-date information — recent papers, current statistics, breaking developments in a field — ChatGPT can query the web in real time and synthesize findings directly in conversation. Combined with file uploads, you can feed it PDFs of academic papers, reports, or datasets and ask it to extract key arguments, compare methodologies, or summarize findings. Code execution adds another layer: statistical analysis, data visualization, and running calculations are all possible within the same workflow. Its GPQA Diamond score of 92.8% reflects genuinely strong graduate-level scientific reasoning, making it reliable for technically demanding research questions. The canvas feature also helps when you need to iteratively draft a literature review or research summary.
Qwen's case for research centers on depth over breadth. Its 256K context window means you can load extremely long documents — entire research reports, multi-chapter theses, or extensive codebases — and ask nuanced questions across the full text without losing coherence. Its AIME 2025 score of 91.3% signals strong mathematical reasoning, which matters if your research involves quantitative work. Qwen is also exceptionally capable in multilingual contexts, making it the clear choice for researchers working with Chinese-language sources, cross-regional studies, or international academic literature. Cost is another factor: at roughly $0.40 per million input tokens versus ChatGPT's ~$2.50, heavy API usage for large-scale document analysis becomes far more feasible with Qwen.
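The cost gap is easy to quantify with the input-token prices quoted above ($2.50 vs $0.40 per 1M input tokens). The sketch below is a back-of-envelope estimate only: it ignores output-token pricing, caching discounts, and tier differences, and the 1,000-paper workload is an assumed example.

```python
# Back-of-envelope API cost comparison using the per-1M-input-token
# prices cited in this article. Output-token pricing and volume
# discounts are deliberately ignored for simplicity.

PRICE_PER_M_INPUT = {"chatgpt": 2.50, "qwen": 0.40}  # USD per 1M tokens

def input_cost(model: str, tokens: int) -> float:
    """USD cost to process `tokens` input tokens with `model`."""
    return PRICE_PER_M_INPUT[model] / 1_000_000 * tokens

# Hypothetical workload: 1,000 papers at ~50K tokens each.
tokens = 1_000 * 50_000
print(f"ChatGPT: ${input_cost('chatgpt', tokens):,.2f}")  # → $125.00
print(f"Qwen:    ${input_cost('qwen', tokens):,.2f}")     # → $20.00
```

At bulk-analysis scale the roughly 6x price difference compounds quickly, which is the concrete basis for the "cost-sensitive pipelines" recommendation below.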
The practical gap, however, is significant: Qwen lacks web search, file uploads, and code execution — three features that form the backbone of many modern research workflows. If you rely on pulling live sources, running data analysis, or uploading PDFs directly, Qwen simply can't match ChatGPT's out-of-the-box capability.
For most researchers, ChatGPT is the stronger all-around tool — particularly for exploratory research, literature synthesis, and any work requiring current information or data analysis. Qwen shines in specific scenarios: processing very long documents at scale, working with Chinese-language content, or running cost-sensitive pipelines where you're making thousands of API calls.
Recommendation: Choose ChatGPT for general research workflows, especially if web access and file uploads matter. Choose Qwen if you're doing multilingual research, need to process very long texts affordably, or are building a research pipeline on a budget.
Try research tasks with ChatGPT and Qwen
Compare in Multichat — free. Join 10,000+ professionals who use Multichat.