WIDE Technical-Report in 2012


                Cybersecurity Information Discovery Mechanism
               wide-tr-TakeshiTakahashi-SecInfoDiscovery-00.txt


                     WIDE Project: http://www.wide.ad.jp/
                 If you have any comments on WIDE documents,
                     please contact board@wide.ad.jp.

Title: Cybersecurity Information Discovery Mechanism
Author(s): Takeshi Takahashi and Youki Kadobayashi
Date: 2012-12-15


CYBEX Working Group                                         T. Takahashi
                                                                    NICT
                                                          Y. Kadobayashi
                                                                   NAIST


             Cybersecurity Information Discovery Mechanism
            wide-tr-TakeshiTakahashi-SecInfoDiscovery-00.txt


Table of Contents

   1.  Introduction
   2.  Proposed Mechanism
     2.1.  Roles
     2.2.  Information Structure
     2.3.  Protocol
   3.  Prototype
     3.1.  Implementation
     3.2.  Demonstration
   4.  Discussion and Future Works
   5.  Copyright Notice
   6.  Normative References
   Authors' Addresses


Takahashi & Kadobayashi                                         [Page 2]

wide-tr     Cybersecurity Information Discovery          Oct 2012


1.  Introduction

   To cope with the increasing number of cyber threats, organizations
   need to share cybersecurity information across the borders of
   organizations, countries, and even languages.  Assorted organizations
   have built repositories that store and provide XML-based
   cybersecurity information on the Internet.  Among them are NVD,
   OSVDB, and JVN, and more cybersecurity information from organizations
   in various countries will become available on the Internet.  However,
   users are not aware of all of these repositories.  To advance
   information sharing, the parties who need cybersecurity information
   must be aware of the repositories, be able to identify and locate the
   information across them, and then obtain it over networks.  This
   paper proposes a discovery mechanism that identifies and locates
   sources and types of cybersecurity information and exchanges the
   information over networks.  The mechanism uses the ontology of
   cybersecurity information [Takahashi_SINCONF2010] to incorporate
   assorted formats of such information so that it remains extensible.
   It generates RDF-based metadata from XML-based cybersecurity
   information through the use of XSLT.  This paper also introduces an
   implementation of the proposed mechanism and discusses its
   extensibility and usability.


2.  Proposed Mechanism

2.1.  Roles

   The proposed mechanism introduces four distinct roles.

   A Discovery Client retrieves cybersecurity information by
   communicating with one or more arbitrary Discovery Servers.

   A Discovery Server helps Discovery Clients find the proper
   Information Source by communicating with multiple Registries,
   aggregating information from them, and then delivering the result to
   the Discovery Client.

   A Registry manages an internal repository that contains the metadata
   of Information Sources, which it builds by communicating with them.

   An Information Source provides cybersecurity information, described
   in XML format, by communicating with Registries.

2.2.  Information Structure

   A Registry uses an RDF-based internal repository to maintain the
   metadata list of cybersecurity information residing in Information
   Sources.  The metadata is generated by accessing Information Sources
   and extracting the needed information from them, as described in
   Section 2.3.  Note that the level of detail of the metadata depends
   on the implementation, but a URI that uniquely identifies an
   Information Source is required.  The repository uses the information
   structure described in the table of major cybersecurity information
   standards, which separates the information category from the content
   description format.  The ontology of cybersecurity operational
   information proposed in [Takahashi_SINCONF2010] is used for the
   category, whereas various industry specifications are used for the
   content description format, so that the structure remains extensible
   and compatible with future specifications of this kind.
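
   As an illustration, one metadata entry in such a repository could be
   modeled as below.  This is a minimal Python sketch under stated
   assumptions, not the prototype's data model: the class name
   MetadataEntry, its field names, and the helper index_by_category are
   hypothetical.  The text above only requires a URI that uniquely
   identifies the Information Source and a separation of category from
   content description format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEntry:
    """Hypothetical metadata record kept by a Registry (names are assumptions)."""
    source_uri: str         # URI that uniquely identifies the Information Source
    category: str           # information category taken from the ontology
    content_format: str     # industry specification used to describe the content
    properties: tuple = ()  # optional extra (name, value) pairs; detail level varies


def index_by_category(entries):
    """Group entries by category so a category-based lookup is a dict access."""
    index = {}
    for entry in entries:
        index.setdefault(entry.category, []).append(entry)
    return index
```

   Keeping the category and the format in separate fields means that a
   new content description format could be added without touching the
   category index.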

2.3.  Protocol

   Information Publishing is a procedure for an Information Source to
   publish its XML-based cybersecurity information.  An Information
   Source sends a registration message, which contains the information's
   URI, category, and allowed access methods (e.g., http), to a
   Registry.  The Registry then accesses the URI by using one of the
   methods, receives the information, and converts it into RDF-based
   metadata by running XSLT.  It then generates a Notification message
   and sends it to its Discovery Servers, which may also forward the
   message to Discovery Clients so that they receive security
   information updates as soon as possible.
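
   The publishing steps above can be sketched roughly as follows.  This
   Python sketch is illustrative only and rests on several assumptions:
   it uses xml.etree.ElementTree as a stand-in for the XSLT step, it
   represents metadata as plain (subject, predicate, object) tuples
   rather than a real RDF serialization, and the message layout (dicts
   with 'uri', 'category', and 'methods' keys), the function names, and
   the caller-supplied fetch function are all hypothetical.

```python
import xml.etree.ElementTree as ET


def xml_to_triples(source_uri, xml_text):
    """Turn every non-empty XML element into a (subject, predicate, object) tuple.

    Stand-in for the XSLT step: the subject is the Information Source URI,
    the predicate is the tag name, and the object is the element text.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for element in root.iter():
        text = (element.text or "").strip()
        if text:
            triples.append((source_uri, element.tag, text))
    return triples


def handle_registration(repository, registration, fetch):
    """Process a hypothetical registration message and build a notification.

    `repository` is a dict mapping URIs to triples; `fetch(uri, method)` is
    supplied by the caller and returns the XML text found at the URI.
    """
    method = registration["methods"][0]  # use one of the allowed access methods
    xml_text = fetch(registration["uri"], method)
    triples = xml_to_triples(registration["uri"], xml_text)
    repository.setdefault(registration["uri"], []).extend(triples)
    # The notification would then be sent on to the Discovery Servers.
    return {"type": "notification",
            "uri": registration["uri"],
            "category": registration["category"]}
```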

   Server Registration and Cancellation are procedures for a Discovery
   Client to start and stop using a Discovery Server.  A Discovery
   Client sends a join message to the Discovery Server it wants to use.
   The Discovery Server then replies with a Result message that carries
   the category and supported format information.  Though this paper
   proposes a single category following the ontology proposed in
   [Takahashi_SINCONF2010], the procedure allows different categories to
   be used by embedding different category information in the Result
   message, so that the proposed mechanism remains extensible.  When the
   Discovery Client wishes to stop using the server, it may send a leave
   message to the server.
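
   A minimal sketch of the join and leave exchange is given below.  It
   is an assumption-laden illustration in Python: the class name
   DiscoveryServer, the dict-based message layout, and the key names
   are hypothetical, and the text above does not say whether a leave
   message is answered, so this sketch sends no reply to it.

```python
class DiscoveryServer:
    """Hypothetical Discovery Server that tracks which clients have joined."""

    def __init__(self, categories, formats):
        self.categories = list(categories)  # category information to advertise
        self.formats = list(formats)        # supported content description formats
        self.clients = set()

    def handle(self, message):
        """Dispatch a join or leave message (plain dicts) and return the reply."""
        kind, client = message["type"], message["client"]
        if kind == "join":
            self.clients.add(client)
            # The result message carries the category and format information.
            return {"type": "result",
                    "categories": self.categories,
                    "formats": self.formats}
        if kind == "leave":
            self.clients.discard(client)
            return None  # assumed: no reply to a leave message
        raise ValueError(f"unsupported message type: {kind}")
```

   Because the category list travels in the result message, a server
   that adopts a different category scheme would only need to change
   the value it is constructed with.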

   Information Retrieval is a procedure for a Discovery Client to search
   for and obtain cybersecurity information.  A Discovery Client sends a
   query message to a Discovery Server, which forwards the message to
   all of the Registries it communicates with.  Each Registry then
   searches its internal repository and creates and sends a Result
   message.  The Discovery Server receives the messages from all of the
   Registries, aggregates them into one, ranks and reorders the
   candidate Information Sources, and then embeds the information into a
   new Result message, which is sent back to the Discovery Client.  The
   Discovery Client chooses one Information Source among the candidate
   Information Sources listed in the message.  It then accesses the
   Information Source's URI using an allowed access method and obtains
   the XML information stored in the Information Source.
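
   The fan-out and aggregation performed by the Discovery Server during
   retrieval might look like the following Python sketch.  Everything
   concrete here is an assumption made for illustration: Registries are
   modeled as callables, each Registry result is assumed to be a list
   of (source_uri, score) pairs, and the merge policy (keep the best
   score per URI, order by descending score, then by URI) is
   hypothetical, since the text above does not prescribe one.

```python
def query_registries(registries, query):
    """Forward the query to every Registry and collect the result messages."""
    return [registry(query) for registry in registries]


def aggregate(results):
    """Merge Registry results into one ranked list of candidates.

    A URI reported by several Registries keeps its best score.
    """
    best = {}
    for result in results:
        for uri, score in result:
            if uri not in best or score > best[uri]:
                best[uri] = score
    # Highest score first; ties are broken by URI to keep the order stable.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


def handle_query(registries, query):
    """Build the new result message that goes back to the Discovery Client."""
    candidates = aggregate(query_registries(registries, query))
    return {"type": "result", "candidates": candidates}
```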


3.  Prototype

3.1.  Implementation

   A prototype of the proposed mechanism is implemented in Java on
   CentOS Linux.  It uses a certificate provided by Jetty to certify the
   Information Source.  Its Registry simply converts all the tags of the
   Information Sources' XML information into RDF-based metadata by using
   XSLT, though a more meticulous metadata extraction mechanism could be
   implemented if needed.  Sesame, an implementation of a SPARQL engine,
   is also used.  The proposed mechanism allows an Information Source to
   support arbitrary transport protocols for access to it, but this
   implementation supports only HTTP, HTTPS, and WebSocket.  During the
   retrieval procedure, the Registry needs to rank candidate Information
   Sources.  Though the ranking algorithm is outside the scope of this
   paper, the implementation adopts the following simple algorithm.  The
   algorithm counts the number of keywords found in a tag and divides
   that number by the total number of words in the tag.  It then ranks
   entries with a higher resulting value higher.  If two entries have
   the same value, the one with the older registration date is ranked
   higher.
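
   The ranking rule described above can be sketched in Python as
   follows.  This is one possible reading, not the prototype's Java
   code: the function names, the (source_uri, tag_text, registered_on)
   entry layout, whitespace tokenization, and case-insensitive matching
   are assumptions introduced for the example.

```python
from datetime import date


def keyword_ratio(tag_text, keywords):
    """Share of the words in a tag that match one of the query keywords."""
    words = tag_text.lower().split()
    if not words:
        return 0.0
    wanted = {keyword.lower() for keyword in keywords}
    hits = sum(1 for word in words if word in wanted)
    return hits / len(words)


def rank(entries, keywords):
    """Order entries by descending ratio; older registration wins a tie.

    Each entry is a (source_uri, tag_text, registered_on) tuple, where
    registered_on is a datetime.date.
    """
    return sorted(
        entries,
        key=lambda entry: (-keyword_ratio(entry[1], keywords), entry[2]),
    )
```

   For example, with the keywords "buffer" and "overflow", a tag that
   reads "buffer overflow" scores 2/2 and outranks a tag that reads
   "buffer overflow in parser", which scores 2/4.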

3.2.  Demonstration

   We set up a demonstration in which an Information Source publishes
   its cybersecurity information.  The demonstration is conducted over a
   network consisting of 1 Discovery Client, 3 Discovery Servers, 15
   Registries, and 30 Information Sources, all of which run on different
   virtual machines.

   The demonstration has a network view, which shows the network
   topology and the communication status within it.

   The demonstration also has a search view for Discovery Clients.  It
   provides category-based search, keyword search, and security
   information updates.

   The keyword search is in the bottom part of the view.  Users can
   enter an arbitrary keyword in the bottom text box and run a search by
   clicking on the "Search" button.  Users can run more sophisticated
   searches by clicking on the "Advance Search" button in the view and
   moving to the advanced search view, where they may specify the target
   tags of the retrieval.  When specifying the tags, the users may
   look up the available tags by clicking on the "Select category"
   button.  The Discovery Client can provide the category information
   because it went through the server registration procedure described
   in Section 2.3, where it received the information from its Discovery
   Server.  Users can simply select the tag and then specify the keyword
   in the advanced search.


4.  Discussion and Future Works

   The proposed mechanism incorporates various formats defined by
   assorted industry specifications, which are still being developed.
   Its metadata structure is designed to remain extensible.  If a
   current information format becomes obsolete, a new specification can
   be introduced as a means to describe information of the types defined
   by the ontology.  In this way, the changes are kept minimal.
   Moreover, the ontology itself could be extended, though it is
   designed so that such extensions will not be needed in the near
   future.

   In addition to being extensible, the mechanism needs to be scalable
   to accommodate a large volume of cybersecurity information.
   Evaluating its scalability is left as future work.

   The proposed mechanism enables users to search cybersecurity
   information across assorted repositories including NVD and JVN.

   The current implementation, however, runs on an intranet that is
   isolated from the Internet because it does not yet address the
   security of the system.  For instance, it may suffer from
   impersonation or man-in-the-middle attacks, which may cause severe
   security incidents.  Though security is outside the scope of this
   paper, our future work will address this issue and integrate assorted
   security techniques to reinforce the mechanism's security level.

   Further information with reader-friendly diagrams is available in
   [Takahashi_ACSAC2012_Poster].


5.  Copyright Notice

   Copyright (C) Takeshi Takahashi (2012).  All Rights Reserved.


6.  Normative References

   [Takahashi_SINCONF2010]
              Takahashi, T., Kadobayashi, Y., and H. Fujiwara,


Takahashi & Kadobayashi  Expires April 18, 2013                 [Page 6]

              "Ontological Approach toward Cybersecurity in Cloud
              Computing", International Conference on Security of
              Information and Networks SIN, September 2010.

   [Takahashi_ACSAC2012_Poster]
              Takahashi, T., Kadobayashi, Y., and Y. Takano, "Linking
              Cybersecurity Knowledge: Cybersecurity Information
              Discovery Mechanism", ACSAC Poster Session, November 2012.


Authors' Addresses

   Takeshi Takahashi
   National Institute of Information and Communications Technology
   4-2-1 Nukui-Kitamachi Koganei
   184-8795 Tokyo
   Japan

   Phone: +81 423 27 5862
   Email: takeshi_takahashi@nict.go.jp


   Youki Kadobayashi
   Nara Institute of Science and Technology
   8916-5 Takayama, Ikoma
   630-0192 Nara
   Japan

   Email: youki-k@is.aist-nara.ac.jp

