CECILIA-10C-900-NER

CECILIA-10C-900-NER Dataset

This dataset contains 880 cyber incidents from the CECILIA-10C-900 dataset, annotated for Named Entity Recognition (NER). Each annotation identifies an entity, its label, and its location in the cyber incident text.

There are 18 possible entity labels:

1. Person: names of persons
2. Norp: nationalities, religious or political groups
3. Facilities: buildings, airports, motorways, bridges, etc.
4. GPE_Location: countries, cities, states, etc.
5. Location: mountains, rock formations, seas, rivers, etc.
6. Organization: organizations, companies, institutions, etc.
7. Date: absolute dates, relative dates, time periods
8. Time: any period of time shorter than one day
9. Money: any monetary value together with its unit, e.g. 500€, $1000, etc.
10. Group: groups that are not nationalities, religious groups or political groups, such as hacker groups
11. Quantities: measurements, such as weight, distance, etc.
12. Numbers_O: ordinal numbers, such as first, second, third, etc.
13. Numbers_C: other numbers that are not dates, times, ordinal numbers or measurements of quantities
14. URL
15. Email
16. Phone
17. Address
18. IP
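
As an illustration of how an annotation made of an entity, a label and a location could be handled in code, here is a minimal Python sketch. The record layout is an assumption: the `EntityAnnotation` class, its field names, the use of character offsets with an exclusive end, and the `is_consistent` helper are all hypothetical and are not taken from the dataset files. The incident text is made up. Only the 18 label names come from the list above.

```python
from dataclasses import dataclass

# The 18 label names as listed above. Whether the distributed files
# spell them exactly this way is an assumption.
ENTITY_LABELS = (
    "Person", "Norp", "Facilities", "GPE_Location", "Location",
    "Organization", "Date", "Time", "Money", "Group", "Quantities",
    "Numbers_O", "Numbers_C", "URL", "Email", "Phone", "Address", "IP",
)


@dataclass(frozen=True)
class EntityAnnotation:
    """Hypothetical record: one entity, its label and its character span."""
    text: str
    label: str
    start: int  # assumed: 0-based character offset, inclusive
    end: int    # assumed: character offset, exclusive


def is_consistent(incident_text: str, ann: EntityAnnotation) -> bool:
    """Return True if the label is known and the span matches the entity text."""
    return (
        ann.label in ENTITY_LABELS
        and 0 <= ann.start < ann.end <= len(incident_text)
        and incident_text[ann.start:ann.end] == ann.text
    )


# Made-up incident text, for illustration only.
incident = "Attackers demanded $1000 from the victim via admin@example.com."

# Derive the offsets from the text instead of hard-coding them.
start = incident.find("$1000")
money = EntityAnnotation("$1000", "Money", start, start + len("$1000"))

print(is_consistent(incident, money))
```

A check of this kind can be useful after loading the annotations, since a span that no longer matches its entity text usually points to an encoding or offset-convention mismatch.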

This dataset is available upon request. Please send an email to gvis@unileon.es with the following information:
  • Name of the institution where you work.
  • Brief description of the project in which the dataset will be used.
  • Objective of the specific research for which you need the dataset.
Bear in mind that the request must be sent from an email account belonging to your institution. If you are a student, the request must be made by your supervisor at that institution.