---
_id: '2027'
abstract:
- lang: eng
  text: We present a general framework for applying machine-learning algorithms to
    the verification of Markov decision processes (MDPs). The primary goal of these
    techniques is to improve performance by avoiding an exhaustive exploration of
    the state space. Our framework focuses on probabilistic reachability, which is
    a core property for verification, and is illustrated through two distinct instantiations.
    The first assumes that full knowledge of the MDP is available, and performs a
    heuristic-driven partial exploration of the model, yielding precise lower and
    upper bounds on the required probability. The second tackles the case where we
    may only sample the MDP, and yields probabilistic guarantees, again in terms of
    both the lower and upper bounds, which provides efficient stopping criteria for
    the approximation. The latter is the first extension of statistical model checking
    for unbounded properties in MDPs. In contrast with other related techniques, our
    approach is not restricted to time-bounded (finite-horizon) or discounted properties,
    nor does it assume any particular properties of the MDP. We also show how our
    methods extend to LTL objectives. We present experimental results showing the
    performance of our framework on several examples.
acknowledgement: This research was funded in part by the European Research Council
  (ERC) under grant agreement 246967 (VERIWARE), by the EU FP7 project HIERATIC, by
  the Czech Science Foundation grant No P202/12/P612, and by EPSRC project EP/K038575/1.
alternative_title:
- LNCS
author:
- first_name: Tomáš
  full_name: Brázdil, Tomáš
  last_name: Brázdil
- first_name: Krishnendu
  full_name: Chatterjee, Krishnendu
  id: 2E5DCA20-F248-11E8-B48F-1D18A9856A87
  last_name: Chatterjee
  orcid: 0000-0002-4561-241X
- first_name: Martin
  full_name: Chmelik, Martin
  id: 3624234E-F248-11E8-B48F-1D18A9856A87
  last_name: Chmelik
- first_name: Vojtěch
  full_name: Forejt, Vojtěch
  last_name: Forejt
- first_name: Jan
  full_name: Kretinsky, Jan
  id: 44CEF464-F248-11E8-B48F-1D18A9856A87
  last_name: Kretinsky
  orcid: 0000-0002-8122-2881
- first_name: Marta
  full_name: Kwiatkowska, Marta
  last_name: Kwiatkowska
- first_name: David
  full_name: Parker, David
  last_name: Parker
- first_name: Mateusz
  full_name: Ujma, Mateusz
  last_name: Ujma
citation:
  ama: 'Brázdil T, Chatterjee K, Chmelik M, et al. Verification of Markov decision
    processes using learning algorithms. In: Cassez F, Raskin J-F, eds. <i> Lecture
    Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence
    and Lecture Notes in Bioinformatics)</i>. Vol 8837. Society of Industrial and
    Applied Mathematics; 2014:98-114. doi:<a href="https://doi.org/10.1007/978-3-319-11936-6_8">10.1007/978-3-319-11936-6_8</a>'
  apa: 'Brázdil, T., Chatterjee, K., Chmelik, M., Forejt, V., Kretinsky, J., Kwiatkowska,
    M., … Ujma, M. (2014). Verification of Markov decision processes using learning
    algorithms. In F. Cassez &#38; J.-F. Raskin (Eds.), <i> Lecture Notes in Computer
    Science (including subseries Lecture Notes in Artificial Intelligence and Lecture
    Notes in Bioinformatics)</i> (Vol. 8837, pp. 98–114). Sydney, Australia: Society
    of Industrial and Applied Mathematics. <a href="https://doi.org/10.1007/978-3-319-11936-6_8">https://doi.org/10.1007/978-3-319-11936-6_8</a>'
  chicago: Brázdil, Tomáš, Krishnendu Chatterjee, Martin Chmelik, Vojtěch Forejt,
    Jan Kretinsky, Marta Kwiatkowska, David Parker, and Mateusz Ujma. “Verification
    of Markov Decision Processes Using Learning Algorithms.” In <i> Lecture Notes
    in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence
    and Lecture Notes in Bioinformatics)</i>, edited by Franck Cassez and Jean-François
    Raskin, 8837:98–114. Society of Industrial and Applied Mathematics, 2014. <a href="https://doi.org/10.1007/978-3-319-11936-6_8">https://doi.org/10.1007/978-3-319-11936-6_8</a>.
  ieee: T. Brázdil <i>et al.</i>, “Verification of Markov decision processes using
    learning algorithms,” in <i> Lecture Notes in Computer Science (including subseries
    Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</i>,
    Sydney, Australia, 2014, vol. 8837, pp. 98–114.
  ista: 'Brázdil T, Chatterjee K, Chmelik M, Forejt V, Kretinsky J, Kwiatkowska M,
    Parker D, Ujma M. 2014. Verification of Markov decision processes using learning
    algorithms.  Lecture Notes in Computer Science (including subseries Lecture Notes
    in Artificial Intelligence and Lecture Notes in Bioinformatics). ALENEX: Algorithm
    Engineering and Experiments, LNCS, vol. 8837, 98–114.'
  mla: Brázdil, Tomáš, et al. “Verification of Markov Decision Processes Using Learning
    Algorithms.” <i> Lecture Notes in Computer Science (Including Subseries Lecture
    Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</i>, edited
    by Franck Cassez and Jean-François Raskin, vol. 8837, Society of Industrial and
    Applied Mathematics, 2014, pp. 98–114, doi:<a href="https://doi.org/10.1007/978-3-319-11936-6_8">10.1007/978-3-319-11936-6_8</a>.
  short: T. Brázdil, K. Chatterjee, M. Chmelik, V. Forejt, J. Kretinsky, M. Kwiatkowska,
    D. Parker, M. Ujma, in:, F. Cassez, J.-F. Raskin (Eds.),  Lecture Notes in Computer
    Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture
    Notes in Bioinformatics), Society of Industrial and Applied Mathematics, 2014,
    pp. 98–114.
conference:
  end_date: 2014-11-07
  location: Sydney, Australia
  name: 'ALENEX: Algorithm Engineering and Experiments'
  start_date: 2014-11-03
date_created: 2018-12-11T11:55:17Z
date_published: 2014-11-01T00:00:00Z
date_updated: 2021-01-12T06:54:49Z
day: '01'
department:
- _id: KrCh
- _id: ToHe
doi: 10.1007/978-3-319-11936-6_8
ec_funded: 1
editor:
- first_name: Franck
  full_name: Cassez, Franck
  last_name: Cassez
- first_name: Jean-François
  full_name: Raskin, Jean-François
  last_name: Raskin
intvolume: '      8837'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: http://arxiv.org/abs/1402.2967
month: '11'
oa: 1
oa_version: Submitted Version
page: 98 - 114
project:
- _id: 25EE3708-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '267989'
  name: Quantitative Reactive Modeling
- _id: 26241A12-B435-11E9-9278-68D0E5697425
  grant_number: '24696'
  name: LIGHT-REGULATED LIGAND TRAPS FOR SPATIO-TEMPORAL INHIBITION OF CELL SIGNALING
- _id: 2581B60A-B435-11E9-9278-68D0E5697425
  call_identifier: FP7
  grant_number: '279307'
  name: 'Quantitative Graph Games: Theory and Applications'
- _id: 25F5A88A-B435-11E9-9278-68D0E5697425
  call_identifier: FWF
  grant_number: S11402-N23
  name: Moderne Concurrency Paradigms
- _id: 25863FF4-B435-11E9-9278-68D0E5697425
  call_identifier: FWF
  grant_number: S11407
  name: Game Theory
- _id: 2584A770-B435-11E9-9278-68D0E5697425
  call_identifier: FWF
  grant_number: P 23499-N23
  name: Modern Graph Algorithmic Techniques in Formal Verification
- _id: 2587B514-B435-11E9-9278-68D0E5697425
  name: Microsoft Research Faculty Fellowship
publication: ' Lecture Notes in Computer Science (including subseries Lecture Notes
  in Artificial Intelligence and Lecture Notes in Bioinformatics)'
publication_status: published
publisher: Society of Industrial and Applied Mathematics
publist_id: '5046'
quality_controlled: '1'
status: public
title: Verification of Markov decision processes using learning algorithms
type: conference
user_id: 4435EBFC-F248-11E8-B48F-1D18A9856A87
volume: 8837
year: '2014'
...