Marilyn Monroe famously sang that diamonds are “a girl’s best friend.” But most people don’t want pressurized carbon that comes at the cost of human life, so-called blood diamonds or conflict diamonds. To address these concerns, jewelers provide their customers with ethical certificates regarding the provenance of their jewelry.
AI providers are in a similar position. As machine learning and large language models proliferate, the origin of the data used to train these AI products, and how that data is used, is critical to the organizations deploying these technologies.
Indiscriminately harvested data that flagrantly violates copyright and intellectual property laws is increasingly frowned upon. Broader ethical concerns about how these models are operated and how their data is used are also becoming legal and regulatory issues, and liability concerns are growing.
Companies offering AI products are now providing their customers with detailed reports (ethical scorecards) that show where the data used to train their models came from, how it was processed, and how it was used. These scorecards help organizations build trust with their customers, who can then more confidently present their products to end users.
InformationWeek spoke with Cindi Howson, chief data and AI strategy officer at ThoughtSpot, and Jamie Hutton, co-founder and chief technology officer at Quantexa, about how ethical AI scorecards can provide companies with the transparency they need to choose the right products, while giving end users assurance that they’re receiving properly sourced information.
Legal requirements
The data used to train AI models is subject to a patchwork of inconsistently enforced regulations. The EU AI Act is the only comprehensive set of laws regulating the use of data by AI platforms and, like other European technology regulations, could serve as a template for other jurisdictions. Its obligations overlap with those of another major piece of EU legislation, the GDPR.
Ethics scorecards draw not only on the framework set out in the AI Act but also on non-binding frameworks, such as those issued by the Organization for Economic Co-operation and Development, to report data sources and usage to users and regulators in an easily understood way. Various standards developed by ethicists and published in academic journals can also be used.
While these scorecards generally serve as indicators of ethical behavior, they are also compliance documents that demonstrate a company’s adherence to rules regarding data sourcing, privacy, fairness, and accountability.
Many observers expect widespread AI legislation that will require compensation for users. AI providers such as Anthropic have already been hit on the narrower issue of copyright infringement. Other regulatory bodies also monitor the data used in AI.
“The FDA regulates health care and medical devices,” Howson said. “We have a framework for that, but we haven’t gotten to the specifics.”
In finance, details matter. Howson pointed out that zip codes, for example, cannot be used in credit decisions because they can serve as a proxy for race, a form of discrimination known as redlining.
“Having an explainable and transparent model isn’t just good practice; it’s a requirement,” Hutton said. “Regulators want to make sure that models are not biased, meaning they are not targeting specific age groups, ethnic backgrounds, races or genders.”
Both vendors and users are at risk if an AI model violates these regulations because its creator failed to account for them. Given the broad geographic reach of many models, a generalized approach is advisable, with attention paid to industry-specific and local laws. Scorecards therefore help organizations market their products to clients operating under these constraints and serve as a means of negotiating terms of service.
However, market dynamics complicate the use of scorecards. Hutton pointed out that not everyone wants the most tightly regulated product. “If you’re heavily regulated in Region A, but not in Region B, you’re going to have a competitive advantage challenge,” he said. “This is an issue that every government is trying to address right now.”
Creating an ethics scorecard
Ethics scorecards are complex documents, highly specific to individual industries and customers. They highlight the relevant ethical elements drawn from the model cards compiled during model creation.
“That documentation will include what data it was trained on, what approach was taken, and the justification that the feature is fair,” Hutton said. “It comes down to a huge document that explains everything about the functionality of the model itself.”
Ethics scorecards extract information about the origin and organization of data, the explainability of how data is deployed, model limitations, potential biases, protection of privacy rights, and the capacity for human intervention. They then document how each of these issues relates to compliance obligations.
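The factors listed above can be pictured as structured records. The following is a minimal sketch, in Python, of how one scorecard entry might be represented; the class and field names are illustrative assumptions, not part of any published standard or vendor format.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a single ethics-scorecard entry, covering the
# dimensions mentioned in the article (provenance, explainability, bias,
# privacy, human oversight). All names here are illustrative assumptions.
@dataclass
class ScorecardEntry:
    category: str                # e.g. "Data provenance" or "Explainability"
    score: int                   # 1 (poor) to 5 (excellent)
    rationale: str               # justification for the score
    compliance_refs: list[str] = field(default_factory=list)  # related rules

    def __post_init__(self):
        # Keep scores on the 1-5 scale used by the template below.
        if not 1 <= self.score <= 5:
            raise ValueError("score must be between 1 and 5")
```

Tying each entry to `compliance_refs` mirrors the article’s point that a scorecard is both an ethics indicator and a compliance document.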
However, the scoring process is also complex. Standardized, objective metrics for scoring these factors have not yet been widely adopted. And while some machine learning applications offer relatively easy access to this information, LLMs and agentic AI are more opaque. They behave in ways that even their creators don’t fully understand, making them difficult to score accurately.
“They’re becoming more of a black box than ever before,” Hutton warned, referring to advanced AI systems. “What does that mean for explainability? I don’t have a good answer for that yet, but I think it’s going to be a trend that everyone needs to understand.” Howson also sounded the alarm on LLMs. “Initially, LLM evaluation was just about accuracy,” she said. The main criterion was how many correct responses a model could generate, a focus on performance that often came at the expense of transparency and ethical considerations.
“Most of the time, LLMs are not transparent. We don’t know the full extent of the data on which the GPT models were trained,” she said, stressing the need for companies to adopt “ethics by design,” a practice that embeds ethical principles such as transparency, accountability, and fairness into the development process from the beginning.
Benchmarks such as Stanford University’s Holistic Evaluation of Language Models (HELM) provide guidance on scoring safety and bias, and can provide value to organizations and clients that rely on these qualities to protect their reputations.
In the meantime, even a crudely crafted ethics scorecard can be an asset to vendors and organizations alike as they navigate AI implementation and its consequences.
Ethics scorecard for AI systems: Evaluation criteria
Scoring system
Poor: Significant improvement is required.
Below average: Some criteria are met, but large gaps remain.
Average: Meets minimum ethical standards.
Good: Goes beyond basic ethical requirements in most areas.
Excellent: Fully compliant with ethical principles and best practices.
Instructions for use
Rate each category: Answer the key questions and assign a score from 1 to 5.
Add comments: Explain the rationale behind each score or highlight areas for improvement.
Use the scorecard: Identify the strengths and weaknesses of AI systems and prioritize ethical improvements.
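The instructions above can be sketched as a short script: collect per-category scores on the 1-to-5 scale, compute an overall average, and surface the weakest areas for prioritized improvement. The categories and the number of priority areas shown are assumptions for demonstration only, not part of the template.

```python
# Illustrative sketch of applying the scorecard: the category names and
# scores here are invented example data, not a prescribed standard.
scores = {
    "Data provenance": 4,
    "Explainability": 2,
    "Bias and fairness": 3,
    "Privacy protection": 5,
    "Human oversight": 3,
}

# Overall average across categories on the 1-5 scale.
average = sum(scores.values()) / len(scores)

# Lowest-scoring categories first; these are the priorities for improvement.
weakest = sorted(scores, key=scores.get)[:2]

print(f"Overall average: {average:.1f}")        # prints "Overall average: 3.4"
print(f"Priority areas: {', '.join(weakest)}")
```

A simple average treats all categories equally; in practice an organization might weight categories by regulatory exposure, per the compliance concerns discussed above.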
Source: The sample scorecard template was generated by Informa TechTarget’s in-house large language model based on established ethical AI guidelines and frameworks from sources such as the European Commission’s Ethics Guidelines for Trustworthy AI, the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems, and Stanford University’s Holistic Evaluation of Language Models.

