Task Description
Knowledge Base Question Answering (KBQA) is a popular task in the field of Natural Language Processing and Information Retrieval, in which the goal is to answer a natural language question using the facts in a Knowledge Base. KBQA can involve several subtasks such as entity linking, relation linking, and answer type prediction. In the SMART 2021 Semantic Web Challenge, we focus on two subtasks in KBQA.
On the one hand, questions can be broadly classified by their Wh-terms (Who, What, When, Where, Which, Whom, Whose, Why), and a more granular answer type classification is possible with popular Semantic Web ontologies such as DBpedia (~760 classes) and Wikidata (~50K classes). On the other hand, relation prediction for questions is a hard task: some relations are semantically distant, the tokens that determine a relation are sometimes spread across the question, some relations are implicit in the text, and there are lexical gaps between relation surface forms and KG property labels.
Thus, in the second iteration of the SMART challenge, we have two independent tasks:
Task 1 - Answer type prediction: Given a question in natural language, the task is to predict the type of the answer using a set of candidates from a target ontology.
Task 2 - Relation set prediction: Given a question in natural language, the task is to predict the relations to be used for identifying the correct answer.
The proceedings of the first edition can be found at SMART 2020 - CEUR Proceedings.
Example for tasks
Question | Answer Type (DBpedia) | Answer Type (Wikidata) |
---|---|---|
Which languages were influenced by Perl? | dbo:ProgrammingLanguage | wd:Q9143 |
Give me all actors starring in movies directed by and starring William Shatner. | dbo:Actor | wd:Q33999 |
How many employees does IBM have? | number | number |
Question | Relations (DBpedia) | Relations (Wikidata) |
---|---|---|
Which languages were influenced by Perl? | dbo:influencedBy | wdt:P737 |
Give me all actors starring in movies directed by and starring William Shatner. | dbo:starring, dbo:director | wdt:P161, wdt:P57 |
How many employees does IBM have? | dbo:numberOfEmployees | wdt:P1128 |
Dataset
Task 1: Answer Type Prediction (SMART2021-AT)
We provide two datasets for the answer type prediction task, one using the DBpedia ontology and the other using the Wikidata ontology. Both follow the structure shown below.
Each question has (a) a question id, (b) the question text in natural language, (c) an answer category ("resource"/"literal"/"boolean"), and (d) an answer type.
If the category is "resource", answer types are ontology classes from either the DBpedia ontology or the Wikidata ontology. If the category is "literal", answer types are either "number", "date", or "string". If the category is "boolean", the answer type is always "boolean".
[
{
"id": "dbpedia_1",
"question": "Who are the gymnasts coached by Amanda Reddin?",
"category": "resource",
"type": ["dbo:Gymnast", "dbo:Athlete", "dbo:Person", "dbo:Agent"]
},
{
"id": "dbpedia_2",
"question": "How many superpowers does wonder woman have?",
"category": "literal",
"type": ["number"]
},
{
"id": "dbpedia_3",
"question": "When did Margaret Mead marry Gregory Bateson?",
"category": "literal",
"type": ["date"]
},
{
"id": "dbpedia_4",
"question": "Is Azerbaijan a member of European Go Federation?",
"category": "boolean",
"type": ["boolean"]
}
]
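The field and category rules above can be checked mechanically before training. The following is a minimal sketch, not part of the official tooling: `validate_record` and `LITERAL_TYPES` are hypothetical names, and the rule that a literal question carries only literal type names is an assumption read off the description and the sample.

```python
import json

# Literal answer types named in the task description.
LITERAL_TYPES = {"number", "date", "string"}


def validate_record(record):
    """Return a list of problems found in one record (empty list if none)."""
    missing = [key for key in ("id", "question", "category", "type") if key not in record]
    if missing:
        return [f"missing field: {key}" for key in missing]

    category, types = record["category"], record["type"]
    problems = []
    if category == "boolean":
        if types != ["boolean"]:
            problems.append('boolean questions must have type ["boolean"]')
    elif category == "literal":
        # Assumption: a literal question carries only literal type names.
        if not types or any(t not in LITERAL_TYPES for t in types):
            problems.append("literal questions must use number, date, or string")
    elif category == "resource":
        if not types:
            problems.append("resource questions need at least one ontology class")
    else:
        problems.append(f"unknown category: {category!r}")
    return problems


# Two records copied from the sample above.
SAMPLE = json.loads("""
[
  {"id": "dbpedia_1", "question": "Who are the gymnasts coached by Amanda Reddin?",
   "category": "resource", "type": ["dbo:Gymnast", "dbo:Athlete", "dbo:Person", "dbo:Agent"]},
  {"id": "dbpedia_2", "question": "How many superpowers does wonder woman have?",
   "category": "literal", "type": ["number"]}
]
""")

for record in SAMPLE:
    print(record["id"], validate_record(record) or "ok")
```

Returning a list of problems rather than raising makes it easy to report every malformed record in a file in one pass.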
Task 2: Relation Prediction (SMART2021-RL)
Similar to AT, we provide two datasets for the relation prediction task, one using the DBpedia ontology and the other using the Wikidata ontology. Both follow the structure shown below.
Each question has (a) a question id, (b) the question text in natural language, and (c) a list of relation blocks (n blocks for multihop questions involving n relations). Each relation block can contain multiple interchangeable relations.
We also provide a restricted vocabulary of all the relations that are used in the dataset (both training and test sets).
[
{
"id": "smart-2021-rl-dbpedia-0",
"question": "Where was richard sprigg steuart born?",
"relations": [
["dbo:birthPlace"]
]
}, {
"id": "smart-2021-rl-dbpedia-1",
"question": "What are the awards won by the producer of Puss in Boots (film)?",
"relations": [
["dbo:producer"],
["dbo:award"]
]
},{
"id": "smart-2021-rl-dbpedia-2",
"question": "Name a college in california",
"relations": [
["dbo:locatedInArea", "dbo:city", "dbo:location", "dbo:region", "dbo:state"]
]
}
]
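To make the block structure concrete, the sketch below reads one record in the format above. The helper names (`hop_count`, `candidate_relations`, `block_is_covered`) are hypothetical and only illustrate how blocks and interchangeable relations relate; they are not taken from the challenge code.

```python
def hop_count(record):
    """Number of relation blocks, i.e. the number of hops in the question."""
    return len(record["relations"])


def candidate_relations(record):
    """All relations mentioned in any block, flattened into one set."""
    return {relation for block in record["relations"] for relation in block}


def block_is_covered(predicted, block):
    """True if a prediction contains at least one of the block's interchangeable relations."""
    return any(relation in block for relation in predicted)


# The multihop and the interchangeable-relation records from the sample above.
two_hop = {
    "id": "smart-2021-rl-dbpedia-1",
    "question": "What are the awards won by the producer of Puss in Boots (film)?",
    "relations": [["dbo:producer"], ["dbo:award"]],
}
one_hop = {
    "id": "smart-2021-rl-dbpedia-2",
    "question": "Name a college in california",
    "relations": [["dbo:locatedInArea", "dbo:city", "dbo:location", "dbo:region", "dbo:state"]],
}

print(hop_count(two_hop), sorted(candidate_relations(two_hop)))
print(hop_count(one_hop), block_is_covered(["dbo:state"], one_hop["relations"][0]))
```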
Evaluation Metrics
Task 1: Answer Type Prediction
For each natural language question in the test set, the participating systems are expected to provide two predictions: answer category and answer type. The answer category can be either "resource", "literal", or "boolean". The format is the same as that of the training data.
If the answer category is "resource", the answer type should be an ontology class (DBpedia or Wikidata, depending on the dataset). Systems can predict a ranked list of classes from the corresponding ontology. If the answer category is "literal", the answer type can be either "number", "date", or "string".
Category prediction will be treated as a multi-class classification problem, with accuracy as the metric. For type prediction, we will use lenient NDCG@k with linear decay, as defined in the paper by Balog and Neumayer.
The evaluation scripts can be found here.
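As a rough illustration of the two metrics, the sketch below computes category accuracy and a simplified NDCG@k. It is not the official evaluation script: it uses a binary gain (1 if a predicted class is among the gold types, 0 otherwise) instead of the hierarchy-aware linear-decay gain of Balog and Neumayer, and the log2 discount is one common DCG formulation chosen here as an assumption. The function names are hypothetical.

```python
import math


def category_accuracy(gold, predicted):
    """Fraction of questions whose predicted category equals the gold category.

    Both arguments map question id -> category string.
    """
    if not gold:
        return 0.0
    correct = sum(1 for qid, category in gold.items() if predicted.get(qid) == category)
    return correct / len(gold)


def _dcg(gains):
    # Discounted cumulative gain with a log2(rank + 1) discount, ranks starting at 1.
    return sum(gain / math.log2(rank + 1) for rank, gain in enumerate(gains, start=1))


def binary_ndcg_at_k(ranked_types, gold_types, k):
    """NDCG@k with binary gain; a simplification of the lenient, linear-decay variant."""
    gold = set(gold_types)
    if not gold or k <= 0:
        return 0.0
    # Drop repeated predictions so a duplicated class is not rewarded twice.
    ranking = list(dict.fromkeys(ranked_types))[:k]
    gains = [1.0 if t in gold else 0.0 for t in ranking]
    ideal = [1.0] * min(k, len(gold))
    return _dcg(gains) / _dcg(ideal)


gold_types = ["dbo:Gymnast", "dbo:Athlete", "dbo:Person", "dbo:Agent"]
print(binary_ndcg_at_k(["dbo:Gymnast", "dbo:Place", "dbo:Person"], gold_types, k=5))
print(category_accuracy({"q1": "resource", "q2": "literal"}, {"q1": "resource", "q2": "boolean"}))
```

Replacing the binary gain with a distance-based gain over the type hierarchy would bring this closer to the metric named above, but that requires the ontology's class hierarchy and is left out here.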
Task 2: Relation Prediction
For each question, the participating systems are expected to provide a list of relations. The format of the expected system output is as follows, first for DBpedia and then for Wikidata:
[
{
"id": "smart-2021-rl-dbpedia-test-0",
"relations": ["dbo:location"]
},
{
"id": "smart-2021-rl-dbpedia-test-1",
"relations": ["dbo:award", "dbo:director"]
}
]
[
{
"id": "smart-2021-rl-wikidata-test-0",
"relations": ["P4599"]
},
{
"id": "smart-2021-rl-wikidata-test-1",
"relations": ["P166", "P805"]
}
]
The list of system-predicted relations will be compared with the gold list of relations, and precision, recall, and F1 will be calculated over all questions. Micro F1 will be used to rank the systems.
The evaluation scripts can be found here.
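The micro-averaged scores can be sketched as follows. This is an illustration under stated assumptions rather than the official script: `micro_prf` is a hypothetical name, gold relations are assumed to be available as a flat list per question, and how the official evaluation treats interchangeable relations within a block is not covered.

```python
def micro_prf(gold, predicted):
    """Micro-averaged precision, recall, and F1 over all questions.

    Both arguments map question id -> list of relations. Counts are pooled
    across questions before the ratios are taken, which is what makes the
    average "micro".
    """
    true_pos = false_pos = false_neg = 0
    for qid, gold_relations in gold.items():
        gold_set = set(gold_relations)
        predicted_set = set(predicted.get(qid, []))
        true_pos += len(predicted_set & gold_set)
        false_pos += len(predicted_set - gold_set)
        false_neg += len(gold_set - predicted_set)

    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical gold and predicted relations for two questions.
gold = {"q0": ["dbo:location"], "q1": ["dbo:producer", "dbo:award"]}
predicted = {"q0": ["dbo:location"], "q1": ["dbo:award", "dbo:director"]}
print(micro_prf(gold, predicted))
```

A question with no prediction contributes only false negatives, so leaving questions unanswered lowers recall without affecting precision.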
Submission Details
Participants are requested to submit the system output for the test data. The format is the same as that of the training data. In addition, participants are requested to submit a system description that will be included in a joint ISWC challenge proceedings volume in CEUR. System descriptions must be in English, in either PDF or HTML, formatted in the LNCS style, and no longer than 12 pages. Submissions can be sent via EasyChair. There is a Slack workspace for challenge-related discussions; please send an email to request an invite. The accepted systems will get the opportunity to show their results during the ISWC 2021 poster and demo session.
Important Dates
Date | Description |
---|---|
15th of July 2021 | Release of the training set. |
14th of September 2021 | Release of the test set. |
4th of October 2021 | Submission of system output. |
8th of October 2021 | Publication of results. System description submission. |
24th-28th of October 2021 | ISWC Conference (visit iswc2021.semanticweb.org). |
8th of November 2021 | Camera-ready submission. |