Evidence Scores: Essential for Drawing Sound Conclusions

David Hume said that we must proportion our beliefs to the evidence. This requires math. We simply cannot arrive at sound conclusions without ranking the evidence. To differentiate “Bad” from “Good” requires at least a binary scoring system (i.e., bad = 0 and good = 1). However, consider the mathematical representation of the terms “moron,” “stupid,” “dull,” “dim,” “average,” “smart,” and “genius.” A computer might think of these terms as an intelligence scale of 1 to 7: all of them discuss the same underlying quality, but each assigns a score on a seven-category spectrum.

If we want to successfully move into the future, we must distinguish between good and bad ideas, and unfortunately, the present level of discourse falls short.

What if we employed a system that correlates conclusion scores with the relative strength of pro and con arguments? Initially, the numerical values might be arbitrary, but the critical aspect is creating a dynamic relationship where strengthening or weakening the supporting arguments directly impacts the strength of the conclusion.


Belief scores, representing the robustness of a belief, can be computed by adding the scores of supporting evidence and arguments and subtracting those of opposing ones. By grouping synonymous expressions to eliminate redundancy and applying distinct scores for truth (free from logical fallacy or empirically validated) and relevance (an argument may be true without affecting or relating to the truth of the conclusion), we lay the groundwork for conclusion scores that, over time, should correlate with the widely accepted relative likelihood of each belief’s truth.
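A minimal sketch of this computation, assuming each argument carries a truth score and a relevance/importance score between 0 and 1. The function names and scales are illustrative, not the site’s actual code, and duplicates are assumed to have already been grouped:

```python
def argument_weight(truth, importance):
    """An argument counts only as much as it is both true and relevant."""
    return truth * importance

def belief_score(pro_arguments, con_arguments):
    """Sum the weighted pro arguments and subtract the weighted con arguments.

    Each argument is a (truth, importance) pair; similar ways of saying the
    same thing are assumed to have been grouped so nothing is double-counted.
    """
    pro = sum(argument_weight(t, i) for t, i in pro_arguments)
    con = sum(argument_weight(t, i) for t, i in con_arguments)
    return pro - con
```

For example, two pro arguments weighted (0.9, 0.8) and (0.6, 0.5) against one con argument weighted (0.7, 0.2) yield a belief score of 0.72 + 0.30 − 0.14 = 0.88.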

This systematic approach not only elevates our dialogue but also provides a coherent framework for evaluating ideas, facilitating a more informed and discerning public discourse.

In our minds, we try to subtract the strength of the con arguments from the strength of the pro arguments, but confirmation and “my team” biases make us terrible at this.

Promoting Rigorous and Evidence-Based Decision-Making 

A Measure of the Reliability of Evidence 

The EVS measures the degree to which a belief has been verified by various forms of evidence, including scientific studies, historical trends, social experiments, and anecdotal evidence. The score considers the quantity, similarity, and quality of the evidence, with a focus on independence. The algorithm calculates the Evidence Source Independence Weighting (ESIW), Evidence-to-Conclusion Relevance Score (ECRS), Evidence Replication Quantity (ERQ), and Evidence Replication Percentage (ERP) to obtain the full EVS. By combining these metrics, the EVS provides a transparent and reliable assessment of the strength of the supporting evidence for a belief.

Calculating the EVS

To calculate this score, we assess the relative strength of each “evidence” proposed as reasons to strengthen or weaken a belief. The score considers the quantity and similarity of scenarios tested, the number of replications, and the degree of similarity. The score also considers the quality of the studies or evidence, including bias, methodology, and sample size.

Evidence Source Independence Weighting (ESIW)

This weighting is a critical component of our evaluation process that helps us determine the reliability of different types of evidence. Our weighting algorithm assigns a score to each type of evidence based on its level of independence.

To ensure transparency, we have implemented two separate pro/con argument threads, with up/down votes and other measures to promote and measure the quality of arguments. This helps us determine our confidence that each piece of evidence has been assigned to the appropriate category.

The reliability rankings of different types of evidence, sorted from most to least reliable (based on our current scoring system), are as follows:

  • Statistics and Data with links to sources
  • Formal scientific studies and results from experiments or trials (meta-analysis, systematic review, randomized controlled trial (double-blind, single-blind), cohort study, case-control study, cross-sectional study, longitudinal study, observational study, correlational study, experimental study, and quasi-experimental study; each should have links to the published results)
  • Proposed historical trends (with references to data from history)
  • Expert testimony from relevant authorities, official documents, reports, and published claims (with evidence to support the causal relationships)
  • Expert and social media claims
  • Personal experience or anecdotal evidence
  • Common sense or logical reasoning
  • Analogies or metaphors
  • Cultural or social norms
  • Intuition or gut feeling (based on evolved or adaptive ethics and morals)
  • News articles or media reports
  • Survey data or public opinion polls
  • Eye-witness testimony
  • Visual evidence such as photographs or videos
  • Historical artifacts or documents
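To make the weighting concrete, here is a hypothetical ESIW lookup table that follows the ordering above. The numeric values are placeholders that the pro/con voting described earlier would set and revise; they are not our actual weights:

```python
# Illustrative ESIW weights (0 to 1), ordered roughly by source independence.
# These values are placeholders, not the system's real numbers.
ESIW_WEIGHTS = {
    "statistics_with_sources": 0.95,
    "formal_scientific_study": 0.90,
    "historical_trend": 0.75,
    "expert_testimony": 0.70,
    "social_media_claim": 0.40,
    "anecdote": 0.30,
    "analogy": 0.25,
    "intuition": 0.20,
}

def esiw(evidence_type):
    """Look up the independence weighting for a category of evidence."""
    return ESIW_WEIGHTS.get(evidence_type, 0.1)  # default for unranked types
```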

Rest assured, we’ll show you our math and provide complete transparency throughout our evaluation process, so you can understand how each piece of evidence is weighted and the impact it has on our overall conclusion.

Evidence-to-Conclusion Relevance Score (ECRS)

Introducing the Evidence-to-Conclusion Relevance Score (ECRS), a key metric for the open internet evaluation process. The ECRS is the score given to the relevance of the evidence presented as a reason to support or oppose a conclusion. This score is calculated from the performance of pro/con sub-arguments about whether the evidence would necessarily prove the conclusion if, for example, the evidence were infinitely replicated by double-blind scientific methods.

Don’t worry; we won’t leave you wondering how this score is calculated. We’ll show you our math and provide complete transparency throughout our evaluation process.

Evidence Replication Scores

Evidence Replication Quantity (ERQ)

Used to account for the number of times a study or experiment has been replicated. 

Evidence Replication Percentage (ERP)

Used to account for the percentage of replications that produced results consistent with the original study.

To illustrate the use of ERQ and ERP, let’s consider a hypothetical scenario in which a study has been conducted multiple times to examine the effects of a certain medication on a particular disease. The ERQ would count the number of times the study has been replicated, while the ERP would measure the percentage of replications that have produced similar results. By using these metrics, we can more accurately evaluate the reliability of the evidence and make informed decisions based on the strength of the supporting evidence and the reliability of the data.
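The two replication scores can be sketched as follows, under the simplifying assumption that each replication is recorded as either reproducing the original result or not:

```python
def replication_scores(replications):
    """Compute ERQ and ERP from a list of replication outcomes.

    `replications` is a list of booleans: True if a replication reproduced
    the original result, False if it did not.
    """
    erq = len(replications)                        # how many replications exist
    erp = sum(replications) / erq if erq else 0.0  # fraction that agreed
    return erq, erp
```

In the medication example, four replications of which three agreed would give ERQ = 4 and ERP = 0.75.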

A Step-by-step breakdown of the algorithm: 

Calculate the Evidence Source Independence Weighting (ESIW) for each piece of evidence.

Calculate the Evidence-to-Conclusion Relevance Score (ECRS) for each piece of evidence.

Multiply the ESIW and ECRS by the Evidence Replication Quantity (ERQ) and the Evidence Replication Percentage (ERP) to obtain the weighted Evidence Verification Score (EVS) for that piece of evidence.

EVS = ESIW * ERQ * ECRS * ERP

Sum the weighted EVS scores for all pieces of evidence to obtain the overall Evidence Verification Score (EVS) for the belief.
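The breakdown above can be expressed directly in code. This sketch assumes each piece of evidence already carries its four component scores; the data layout is illustrative, not the project’s actual implementation:

```python
def evidence_evs(esiw, ecrs, erq, erp):
    """EVS for one piece of evidence: ESIW * ERQ * ECRS * ERP."""
    return esiw * ecrs * erq * erp

def belief_evs(evidence_list):
    """Overall EVS for a belief: the sum of the weighted EVS of every
    piece of evidence. Each item is an (esiw, ecrs, erq, erp) tuple."""
    return sum(evidence_evs(*e) for e in evidence_list)
```

For example, a well-replicated study scored (0.9, 0.8, 4, 0.75) plus an anecdote scored (0.3, 0.5, 1, 1.0) gives 2.16 + 0.15 = 2.31.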

Evidence to Conclusion Linkage Scores

Evidence scores will be based on the quality and number of times the evidence has been verified. 

How

This book explains a framework for building these scores. The framework considers evidence scores and argument scores. It uses a Franklinesque approach, with the conclusion at the top of the page, reasons to agree in one column, and reasons to disagree in another. It counts the reasons to agree and disagree and outlines a framework for assigning them quality scores. However, if we count reasons to agree and disagree, we must group “similar ways of saying the same thing” so we don’t double-count the same arguments. Also, arguments need to be more than just true; they need to be relevant. Therefore, we will have reasons to agree and disagree with the question, “If this argument were true, would it necessarily strengthen the conclusion?” One last bit of math is required because evidence can be true and relevant but unimportant. Therefore, arguments will also need Importance Scores. Even if these scores are too formal for most personal decisions, they are not too formal for decisions that alter the fate of all known life in the universe. Cost-Benefit Analysis (CBA) is the best method for determining whether an argument is “important.”
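The grouping step can be illustrated with a toy sketch. The hand-built lookup table here stands in for whatever similarity detection a real system would use, and the statements are hypothetical:

```python
def count_unique_arguments(statements, group_of):
    """Count arguments after grouping similar ways of saying the same thing,
    so each underlying argument is counted exactly once."""
    return len({group_of(s) for s in statements})

# Hypothetical grouping: the two phrasings of the warming argument
# share one group, so they count as a single argument.
GROUPS = {
    "CO2 emissions warm the planet": "warming",
    "Carbon dioxide causes global warming": "warming",
    "A carbon tax raises consumer prices": "prices",
}
```

With this lookup, the three statements above count as two distinct arguments, not three.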

Verifiability and Replication

Scientists use double-blind studies, replicate results, and use placebos to identify truth. 

Of course, we won’t promote this website by saying, “Come to The Idea Stock Exchange; we have the truth.” Instead, we would say, “This is the collective soul of the internet. This is our current best approximation of the strength of the arguments. Check it out. See if we missed anything.” These are the decisions the internet would make if it were a person using such-and-such algorithms. But you can turn the knobs and see how the recommendations change. You are also free to suggest we change the default knob values.

Code

See here for a link to the code on GitHub:

https://github.com/myklob/ideastockexchange/wiki#evidence-to-conclusion-linkage-score

Linkage

Making Connections: Avoiding Non-Sequiturs with Linkage Scores

Algorithm

Given a list of beliefs, allow users to identify potential reasons to agree or disagree with each other.

Assign a unique ID to each identified linkage between beliefs (e.g., Belief A as a reason to support Belief B and Belief D as a potential reason to oppose F).

Use user-generated arguments, each tagged as a “strengthener” or “weakener” of the linkage, to calculate the strength of the reasons to agree and disagree with each identified linkage.

Calculate the Evidence to Conclusion Linkage Score (ECLS) using the formula: ECLS (A, B) = (Σ strength of reasons to agree with the linkage between A and B) / (Σ total strength of arguments, agree and disagree).

Store the ECLS for each linkage in a table.

Multiply each argument’s score by its corresponding ECLS to determine its contribution to the total conclusion score.

Sum the scores for all arguments supporting or opposing the conclusion to determine the conclusion score.
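The algorithm above can be sketched in Python. The data structures are illustrative simplifications, not the project’s actual implementation:

```python
def ecls(agree_strengths, disagree_strengths):
    """Evidence to Conclusion Linkage Score:
    ECLS(A, B) = (sum of reasons to agree with the linkage)
                 / (total strength of all linkage arguments)."""
    total = sum(agree_strengths) + sum(disagree_strengths)
    return sum(agree_strengths) / total if total else 0.0

def conclusion_score(arguments):
    """Sum each argument's score weighted by its linkage score.

    Each item is an (argument_score, linkage_score) pair; opposing
    arguments carry negative scores.
    """
    return sum(score * linkage for score, linkage in arguments)
```

For instance, linkage arguments with agree strengths [3, 2] against disagree strength [1] give an ECLS of 5/6, and a supporting argument (0.8, 0.9) combined with an opposing argument (−0.5, 0.2) gives a conclusion score of 0.72 − 0.10 = 0.62.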


The linkage scores are essential because an argument may be true and still fail to support the conclusion. For example, if someone posts the belief that the grass is green as a reason to support a conclusion, it should receive a high truth score but a low linkage score. On the other hand, if someone posts global warming as a reason to support a carbon tax, and a carbon tax is the best way to reduce global warming, this should strengthen the argument-to-conclusion linkage score.

Scientific Studies

Ranking Scientific Studies Supporting or Weakening Political Beliefs

Introduction: A sorted list of the highest-ranked scientific studies that support or oppose political beliefs can help people make informed decisions based on credible evidence. This can be achieved by considering various factors that determine the quality and reliability of a scientific study.

Methodology: The following factors will be used to determine the rank of scientific studies:

Type of study: Blind and double-blind studies are given more weight since they reduce the chances of bias and increase the study’s reliability.

Number of participants: Studies with larger sample sizes are given higher ranks since they provide more statistically significant results.

Repetition of results: Studies with consistent findings across multiple independent studies are given higher ranks since they confirm the study’s validity and reliability.

Citations and Google Scholar Rank: Studies with a high number of citations and a high Google Scholar rank are given higher ranks since they are widely recognized and respected within the scientific community.

Score of arguments: Studies whose conclusions are backed by higher-scoring arguments are given higher ranks, since more credible evidence supports their conclusions.
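As an illustration only, a ranking function might combine these factors as a weighted sum. The weights, normalizations, and category values below are hypothetical, not the system’s actual numbers:

```python
# Hypothetical study-type weights; real values would come from the
# pro/con voting described elsewhere in this book.
STUDY_TYPE_WEIGHT = {"double_blind": 1.0, "single_blind": 0.8, "observational": 0.5}

def study_rank(study_type, n_participants, replication_rate,
               citation_percentile, argument_score):
    """Combine the five ranking factors (each normalized to 0..1)
    into a single illustrative rank score."""
    return (0.30 * STUDY_TYPE_WEIGHT.get(study_type, 0.3)
            + 0.20 * min(n_participants / 1000, 1.0)  # cap sample-size credit
            + 0.25 * replication_rate
            + 0.15 * citation_percentile
            + 0.10 * argument_score)
```

Under these placeholder weights, a large, well-replicated double-blind trial ranks well above a small, rarely cited observational study, which is the ordering the factors above are meant to produce.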

To rank the quality of a scientific study that weakens the conclusion, the following factors will be used:

Score: The total idea score, calculated by adding the number of reasons to agree with the study’s conclusion (A), subtracting the number of reasons to disagree with the study’s conclusion (D), adding the number of reasons to agree with reasons to agree (AA), subtracting the number of reasons to agree with reasons to disagree (AD), subtracting the number of reasons to disagree with reasons to agree (DA), and adding the number of reasons to disagree with reasons to disagree (DD). In short: Score = A − D + AA − AD − DA + DD.
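The total idea score described above reduces to a one-line formula:

```python
def total_idea_score(A, D, AA, AD, DA, DD):
    """Total idea score from nested agree/disagree counts, as defined
    above: A - D + AA - AD - DA + DD."""
    return A - D + AA - AD - DA + DD
```

For example, counts of A=5, D=2, AA=3, AD=1, DA=2, DD=4 yield a total idea score of 7.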

Other factors: Other factors such as the credibility of the source and the study’s methodology will also be considered when ranking the quality of a scientific study that weakens the conclusion.

Conclusion: A sorted list of the highest-ranked scientific studies that support and oppose political beliefs can help people make informed decisions based on credible evidence. By considering various factors that determine the quality and reliability of a scientific study, we can ensure that the rankings are fair and unbiased.