1:- module( bims, [ 2 bims/0, bims/1, 3 bims_version/2, bims_citation/2, 4 dlp_load/1, dlp_load/2, 5 dlp_sample/1,dlp_sample/3, 6 dlp_call/1,dlp_call/3, 7 dlp_call_sum/2, 8 dlp_seed/0, dlp_path_prob/2, dlp_path_prob/3 9 ] ). 10 11:- use_module(library(lib)). 12% also depends on pack(stoics_lib), pack(lib) will ask to download it if missing 13 14:- ensure_loaded( '../src/init_lib' ). 15:- lib(source(bims), homonyms(true)). 16 17% :- lib(stoics_lib). % we should consider loading the whole of stoics_lib, as we using many of its preds 18% 19:- lib(stoics_lib:en_list/2). 20:- lib(stoics_lib:goal_spec/2). 21:- lib(stoics_lib:portray_clauses/2). 22 23:- lib(mcmcms/12). 24:- lib(os_unique_by_date/2). 25:- lib(n_digits/3). 26:- lib(clean_module/1). 27:- lib(report_triggers/1). 28:- lib(bims_bb_remove/2). 29:- lib(remove_template_duplicates/2). 30:- lib(get_date_time/1). 31:- lib(ord_only_add_elem/3). % needed on ad_expand 32:- lib(dlp_load/1). 33:- lib(dlp_sample/1). 34:- lib(dlp_call/1). 35:- lib(dlp_path_prob/2). 36:- lib(dlp_seed/0). 37:- lib(end(bims)). 38% :- lib_pack_end( bims ).
178bims_defaults( ArgsPrv, [
179 chains(3),
180 report([]),
181 results_dir(Dir),
182 iterations(100),
183 tempered([]),
184 seeds(1),
185 progress_percentage(10),
186 progress_stub('.'),
187 DefModelT,
188 prior(ModelSingular),
189 likelihood(Model),
190 data(Model),
191 backtrack(uc),
192 top_goal(ModelSingular),
193 debug(true)
194 ] ) :-
195 en_list( ArgsPrv, Args ),
196 ( (memberchk(results_dir(ArgsDir),Args),ground(ArgsDir)) ->
197 make_directory(ArgsDir)
198 ;
199 true
200 ),
201 DefModel = carts,
202 DefModelT = models(DefModel),
203 append( Args, [DefModelT], Partial ),
204 memberchk( models(Model), Partial ),
205 atom_singular( Model, ModelSingular ),
206 bims_args_results_dir( Args, Dir ).
If the argument (File) corresponds to an existing file, then it is taken to be a settings file. Each argument should be a fact correspond to a known option. For example
chains(3). iterations(100). seeds([1,2,3]).
If the argument (Opts) does not correspond to a file is take to be a list of option terms.
The simplest way to use the software is to make a new directory and run some MCMC chains. The default call,
?- bims. % equivalent to ?- bims([]).
runs a 3 chains (R=3, below) 100 iterations (I=100) MCMC simulation. The models learnt are classifications trees (carts) based on the default prior and the data are the BCW dataset. The above call is equivalent to:
?- bims([models(carts)]).
To run a toy BN learning example run
?- bims([models(bns)]).
This runs 3 chains on some synthetic data of the 8-nodal Asia BN.
To get familiar on how to run bims on private data, make a new directory,
create a subdirecory data
and copy file bims(data/asia.pl)
to
data/test_local.pl
.
?- bims([data(test_local)]).
Opts
bns
.debug(bims)
to get debuging messages. If Dbg==false, nodebug(bims)
is called.seeds(1)
when chains(3)
is given expands to seeds([1,2,3])
.top_goal(Top)
) expects. In general the dependency is with the likelihood,
with the prior expected to be compatible with what the likelihood dictates in terms of data.
In the likelihoods provided, Data is the stem of a filename that is loaded in memory.
The file is looked for in Dir/Data[.pl] where Dir is looked for in [./data,bims(Model/data/)].bims(dlps)
.prefix(Pfx)
recognition)all
is expanded to reporting all known reportable terms.All file name based options: Lk, Data, Prior or Rdir, are passed through absolute_file_name/2.
The predicate generates one results directory (Rdir) and files recording information about each run (R) are placed in Rdir.
*/
303bims :- bims( [] ). 304 305bims( ArgsPrv ) :- 306 en_list( ArgsPrv, Args ), 307 bims_defaults( Args, Defs ), 308 append( Args, Defs, Opts ), 309 debug( bims, 'Bims options: ~w', [Opts] ), 310 bims_option_debug( Opts, DbgRestore ), 311 memberchk( results_dir(ResDPrv), Opts ), 312 absolute_file_name( ResDPrv, ResD ), 313 debug( bims, 'Results directory: ~w', ResD ), 314 memberchk( chains(Runs), Opts ), 315 bims_option_seeds( Opts, Seeds ), 316 remove_template_duplicates( Opts, UnqOpts ), 317 number_codes( Runs, Rcodes ), 318 length( Rcodes, RcLen ), 319 n_digits( RcLen, 0, Zero ), 320 atomic_list_concat( [Zero,'opts.pl'], '-', BoptsB ), 321 directory_file_path( ResD, BoptsB, BroF ), 322 bims_write_options( BroF, UnqOpts ), 323 % fixme: likel 324 % likelihood(Lk=Model) 325 % model(Mdl), 326 PersOpts = [ models(Model), prior(Dlp), likelihood(Lkl), 327 data(Data), top_goal(Goal), 328 report(These) 329 ], 330 maplist( list_element(Opts), PersOpts ), 331 report_triggers( These ), 332 bims_ensure_likelihood_loaded( Model, Lkl, AbsLkl ), 333 bims_ensure_data_loaded( Model, Lkl, Data, AbsData ), 334 bims_locate_prior_file( Model, Dlp, AbsDlp ), 335 bims_write_abs_options( BroF, abs(AbsLkl,AbsData,AbsDlp) ), 336 bims_runs( Runs, 1, RcLen, Model, AbsDlp, Seeds, ResD, Goal, Opts ), 337 get_date_time( Now ), 338 portray_clauses( [finished_at(Now)], [file(BroF),mode(append)] ), 339 bims_option_debug_set( DbgRestore ). 340 341bims_runs( 0, _I, _Dgs, _Mdl, _, _Seelds, _ResD, _Goal, _Opts ) :- 342 !, 343 debug( bims, 'Finished all runs', [] ). 344bims_runs( R, I, Dgs, Mdl, Dlp, [Seed|Seeds], ResD, Goal, Opts ) :- 345 copy_term( Goal, TopG ), 346 % * Out, *Stats, Kernel, ModelType, Prior, PB, Repeats, Data, HotChainsIds, *Seed, SGl 347 % Out = chain_output( 01-chain.pl ), Stats= 01_stats.txt 348 debug( bims, 'Run: ~d', I ), 349 n_digits( Dgs, I, Iat ), 350 bims_out_file( ResD, Iat, chain, pl, Chain ), 351 bims_out_file( ResD, Iat, stats, pl, Stats ), 352 bims_out_file( ResD, Iat, report, pl, Rep ), % fixme: nicer name ? 353 MCopts = [ backtrack(BckT), iterations(Its), tempered(Hot), 354 progress_percentage(Pc), progress_stub(Stub) 355 ], 356 maplist( list_element(Opts), MCopts ), %fixme: allow lists 357 bims_progress_reporter( Its, Pc, Stub ), 358 bims_backtrack_term( BckT, Bck, P ), 359 mcmcms( Chain, Stats, Rep, Bck, Mdl, Dlp, P, Its, true, Hot, Seed, TopG ), 360 !, % fixme: do some cleaning 361 J is I + 1, 362 Q is R - 1, 363 bims_runs( Q, J, Dgs, Mdl, Dlp, Seeds, ResD, Goal, Opts ).
date(Y,M,D)
.
?- bims_version(Vers, Date). Vers = 3:0:0, Date = date(2023, 5, 8).
380bims_version( 3:0:0, date(2023,5,8) ).
bibtex(Type,Key,Pairs)
term of the same publication.
On backtracking it produces all publications in reverse chronological order.
?- bims_citation(A, G), write(A), nl. Distributional Logic Programming for Bayesian Knowledge Representation. Nicos Angelopoulos and James Cussens. International Journal of Approximate Reasoning (IJAR). Volume 80, January 2017, pages 52-66.
In total
?- findall( A, bims_citation(A,G), Pubs ), length( Pubs, Length ). Pubs = [...], Length = 5.
*/
409bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 410 Atom = 'Distributional Logic Programming for Bayesian Knowledge Representation. 411Nicos Angelopoulos and James Cussens. 412International Journal of Approximate Reasoning (IJAR). 413Volume 80, January 2017, pages 52-66.', 414 Type = article, 415 Key = 'AngelopoulosN_CussensJ_2017', 416 Pairs = [ 417 author = 'Nicos Angelopoulos and James Cussens', 418 title = 'Distributional Logic Programming for Bayesian Knowledge', 419 journal= 'International Journal of Approximate Reasoning', 420 year = 2017, 421 month = 'January', 422 volume = 80, 423 pages = '52-66', 424 url = 'http://www.sciencedirect.com/science/article/pii/S0888613X16301232' 425 ]. 426bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 427 Atom = 'Bayesian learning of Bayesian Networks with informative priors. 428Nicos Angelopoulos and James Cussens (2008). 429Special issue on BN learning. Annals of Mathematics and Artificial Intelligence (AMAI) 54(1-3), 53-98.', 430 Type = article, 431 Key = 'AngelopoulosN_Cussens_2008', 432 Pairs = [ 433 author = 'Nicos Angelopoulos and James Cussens', 434 title = 'Bayesian learning of Bayesian Networks with informative priors.', 435 year = '2008', 436 journal= 'Special issue on BN learning. Annals of Mathematics and Artificial Intelligence (AMAI)', 437 volume = '54', 438 issue = '1-3', 439 pages = '53-98', 440 url = 'http://dx.doi.org/10.1007/s10472-009-9133-x' 441 ]. 442bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 443 Atom = 'Exploiting Informative Priors for Bayesian Classification and Regression Trees. 444Nicos Angelopoulos and James Cussens (2005) 445In 19th International Joint Conference on Artificial Intelligence (IJCAI-05), 641-646, Edinburgh, UK, August 2005.', 446 Type = inproceedings, 447 Key = 'AngelopoulosN_Cussens_2005', 448 Pairs = [ 449 title = 'Exploiting Informative Priors for Bayesian Classification and Regression Trees', 450 author = 'Nicos Angelopoulos and James Cussens', 451 year = '2005', 452 inproceedings = 'In 19th International Joint Conference on Artificial Intelligence (IJCAI-05)', 453 pages = '641-646', 454 address = 'Edinburgh, UK', 455 month = 'August', 456 year = 2005 457 ]. 458 459bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 460 Atom = 'Tempering for Bayesian C&RT. 461Angelopoulos, Nicos and Cussens, James (2005) 462In 22nd International Conference on Machine Learning (ICML 2005), 17-24, Bonn, Germany, August 2005.', 463 Type = inproceedings, 464 Key = 'AngelopoulosN_CussensJ_2005a', 465 Pairs = [ 466 title = 'Tempering for {B}ayesian {C&RT}', 467 author = 'Angelopoulos, Nicos and Cussens, James', 468 booktitle= '22nd International Conference on Machine Learning (ICML 2005)', 469 address = 'Bonn, Germany', 470 publisher= 'ACM', 471 month = 'August', 472 pages = '17-24', 473 year = '2005', 474 url = 'ftp://ftp.cs.york.ac.uk/pub/aig/Papers/james.cussens/icml05.pdf' 475 ]. 476 477bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 478 Atom = 'Markov chain Monte Carlo using tree-based priors on model structure. 479Nicos Angelopoulos and James Cussens (2001). 480In 17th Conference on Uncertainty in Artificial Intelligence (UAI-2001), 16-23, Seattle, USA.', 481 Type = inproceedings, 482 Key = 'AngelopoulosN_Cussens_2001', 483 Pairs = [ 484 title = 'Markov chain Monte Carlo using tree-based priors on model structure.', 485 author = 'Nicos Angelopoulos and James Cussens', 486 year = 2001, 487 proceedings = 'In 17th Conference on Uncertainty in Artificial Intelligence (UAI-2001)', 488 pages = '16-23', 489 address = 'Seattle, USA' 490 ]. 491 492bims_lib( Spec ) :- 493 ( Spec = Fname/_Arity -> true; Spec = Fname ), 494 ensure_loaded( bims(src/lib/Fname) ).
501bims_option_seeds( Opts, Seeds ) :-
502 memberchk( chains(Runs), Opts ),
503 memberchk( seeds(PrvSeedS), Opts ),
504 en_list( PrvSeedS, PrvSeeds ),
505 length( PrvSeeds, PrvLen ),
506 Diff is max( Runs - PrvLen, 0 ),
507 last( PrvSeeds, Last ),
508 findall( A, ( between(1,Diff,I),
509 A is Last + I
510 ), As ),
511 append( PrvSeeds, As, Seeds ).
debug(true)
is in Opts.
Before indicates the state of bims debugging prior to the call.
The values for Before are: false or true.519bims_option_debug( Opts, Bef ) :- 520 memberchk( debug(Dbg), Opts ), 521 debugging( bims, Bef ), 522 !, 523 bims_option_debug_set( Dbg ). 524 525bims_option_debug_set( true ) :- 526 debug( bims ). 527bims_option_debug_set( false) :- 528 nodebug( bims ).
?- bims:bims_out_file( dir, '01', chain, pl, File ). File = 'dir/01-chain.pl'.
538bims_out_file( Dir, Iat, Tkn, Ext, File ) :- 539 atomic_list_concat( [Iat,Tkn], '-', Bstem ), 540 file_name_extension( Bstem, Ext, Bname ), 541 directory_file_path( Dir, Bname, File ). 542 543bims_ensure_data_loaded( Model, Lkl, DataT, DataF ) :- 544 DataT =.. [Data|Args], 545 % generic section 546 clean_module( data ), 547 AbsOpts = [file_type(directory),solutions(all)], 548 findall( BimsDD, absolute_file_name(bims(data),BimsDD,AbsOpts), BimsDDs ), 549 debug( bims, 'Bims data dirs: ~w', [BimsDDs] ), 550 member( Dir, [data|BimsDDs] ), 551 directory_file_path( Dir, Data, Stem ), 552 file_name_extension( Stem, pl, DataF ), 553 debug( bims, 'Testing existance of data file: ~w', DataF ), 554 exists_file( DataF ), 555 debug( bims, 'Loading data file: ~w', DataF ), 556 data:load_files( DataF, [silent(true)] ), 557 % data:ensure_loaded( DataF ), 558 bims_data_call( Model, Lkl, Args, PrepF, PrepG ), 559 assert( data:data_file(DataF) ), 560 assert( data:data_models(Model) ), 561 assert( data:data_prep_file(PrepF) ), 562 assert( data:data_prep_goal(PrepG) ), 563 !. 564bims_ensure_data_loaded( Model, Lkl, Data, _DataF ) :- 565 throw( fixme(cannot_load_data_for(Model,Lkl,Data)) ). 566 567bims_ensure_likelihood_loaded( Model, Lkl, PlLkl ) :- 568 Sub = models/Model/lklhoods/Lkl/Lkl, 569 clean_module( bims_lkl ), 570 absolute_file_name( bims(Sub), LklStem, [solutions(all)] ), 571 file_name_extension( LklStem, pl, PlLkl ), 572 debug( bims, 'Looking for likelihood in file: ~w', PlLkl ), 573 exists_file( PlLkl ), 574 debug( bims, 'Loading likelihood from file: ~w', PlLkl ), 575 % ensure_loaded( bims:PlLkl ), 576 bims_lkl:load_files( PlLkl, [silent(true),if(true)] ), 577 !. 578bims_ensure_likelihood_loaded( Model, Lkl, _ ) :- 579 throw( fixme(cannot_find_likelihood_for(Lkl,Model)) ). 580 581bims_data_call( Model, Lkl, Args, DataMan, GoalCopy ) :- 582 atom_concat( Lkl, '_data', BaseStem ), 583 file_name_extension( BaseStem, pl, BaseName ), 584 Sub = models/Model/lklhoods/Lkl, 585 AbsOpts = [file_type(directory),solutions(all)], 586 absolute_file_name( bims(Sub), LklD, AbsOpts ), 587 directory_file_path( LklD, BaseName, DataMan ), 588 debug( bims, 'Looking data preparation file in: ~w', DataMan ), 589 exists_file( DataMan ), 590 debug( bims, 'Loading data preparation file: ~w', DataMan ), 591 bims_data:load_files( DataMan, [silent(true),if(true)] ), 592 Goal =.. [BaseStem,Args], 593 duplicate_term( Goal, GoalCopy ), 594 call( bims_data:Goal ). 595 596list_element( List, Element ) :- 597 memberchk( Element, List ). 598 599bims_backtrack_term( Term, Bck, P ) :- 600 functor( Term, Bck, Arity ), 601 bims_backtrack_arity_term( Arity, Term, P ). 602 % fixme: add Arity > 1 error 603 604bims_backtrack_arity_term( 0, Atom, Atom ). 605bims_backtrack_arity_term( 1, Term, Arg ) :- 606 arg( 1, Term, Arg ). 607 608bims_locate_prior_file( Model, DlpStem, AbsDlp ) :- 609 AbsOpts = [file_type(directory),solutions(all)], 610 absolute_file_name( bims(models/Model/dlps), Bims, AbsOpts ), 611 member( Dir, ['dlps',Bims] ), 612 directory_file_path( Dir, DlpStem, DlpPathStem ), 613 file_name_extension( DlpPathStem, dlp, AbsDlp ), 614 debug( bims, 'Looking for prior in file: ~w', AbsDlp ), 615 exists_file( AbsDlp ), 616 debug( bims, 'Will be using prior in file: ~w', AbsDlp ), 617 !. 618bims_locate_prior_file( Model, DlpStem, _AbsDlp ) :- 619 throw( fixme(cannot_locate_prior(Model,DlpStem)) ). 620 621atom_singular( Atom, Singular ) :- 622 atom_concat( Singular, s, Atom ), 623 !. 624atom_singular( Atom, Atom ). 625 626bims_args_results_dir( Args, Dir ) :- 627 memberchk( results_dir(ResDir), Args ), 628 !, 629 Dir = ResDir, 630 bims_args_results_dir_given( ResDir, Args ). 631bims_args_results_dir( Args, Dir ) :- 632 bims_args_results_dir_constructed( Args, Dir ). 633 634bims_args_results_dir_given( ResDir, _Args ) :- 635 ground( ResDir ), 636 !. 637bims_args_results_dir_given( ResDir, Args ) :- 638 var( ResDir ), 639 !, 640 bims_args_results_dir_constructed( Args, ResDir ). 641bims_args_results_dir_given( ResDir, _Args ) :- 642 throw( partially_instantiated_results_dir(ResDir) ). 643 644bims_args_results_dir_constructed( Args, ResDir ) :- 645 ( memberchk(dir_prefix(Pfx),Args) -> true; Pfx = res ), 646 !, 647 os_unique_by_date( Pfx, ResDir ). 648 649bims_write_abs_options( OptsFile, abs(Lkl,Data,Dlp) ) :- 650 open( OptsFile, append, Out ), 651 nl( Out ), 652 write( Out, '% derived' ), nl( Out ), 653 Terms = [ likelihood_file(Lkl), 654 data_file(Data), 655 prior_file(Dlp) 656 ], 657 maplist( mcmcms_write_fact(Out), Terms ), 658 nl( Out ), 659 write( Out, '% date/time stamps' ), nl( Out ), 660 get_date_time( Now ), 661 mcmcms_write_fact( Out, started_at(Now) ), 662 close( Out ). 663 664bims_progress_reporter( Its, Pc, Stub ) :- 665 bims_bb_remove( progress, _ ), 666 number( Pc ), 667 0 < Pc, Pc < 100, 668 !, 669 calc_percentiles( Pc, Pc, Its, [HPtl|TPtiles] ), 670 bims_bb_put( progress, pts(Stub,HPtl,TPtiles) ). 671bims_progress_reporter( _Its, _Pc, _Stub ). 672 673bims_write_options( BroF, UnqOpts ) :- 674 open( BroF, write, Out ), 675 write( Out, '% options' ), nl( Out ), 676 maplist( mcmcms_write_fact(Out), UnqOpts ), 677 close( Out ). 678 679calc_percentiles( Perc, _Step, _Repeats, Ptiles ) :- 680 Perc > 100, 681 !, 682 Ptiles = []. 683calc_percentiles( Perc, Step, Repeats, [F|M] ) :- 684 F is integer( Perc * Repeats / 100 ), 685 NxP is Perc + Step, 686 calc_percentiles( NxP, Step, Repeats, M )
Bims- Bayesian inference over model structures.
Introduction
Bims (Bayesian inference over model structures) implements MCMC learning over statistical models defined in the Dlp (Distributional logic programming) probabilistic language.
Bims is released under GPL2, or Artistic 2.0
Currently there are 2 model spaces supported:
Additional model spaces can be easily implemented by defining new likelihood plug-ins and programming appropriate priors.
Examples provided
Carts examples
The above are two equivalent ways to run the Carts example provided.
This runs 3 chains each of length 100 on the default Carts data using the default likelihood. The default dataset is the breast cancer Winsconsin (BCW) data from the machine learning repository. There are 2 categories, 9 variables and 683 data points in this dataset. You can view the data with
The default likelihood is an implementation of the classification likelihood function presented in: H Chipman, E George, and R McCulloch. Bayesian CART model search (with discussion). J. of the American Statistical Association, 93:935â960, 1998.
Bns examples
The above are two equivalent ways to run the Bns example provided.
This runs 3 chains each of length 100 on the default bns data using default likelihood. The dataset is a sampled dataset from the ASIA network and it comprises of 8 variables and 2295 datapoints. You can view the data with
The default BN likelihood is an instance of the BDeu metric for scoring BN structures.
W. L. Buntine. Theory refinement of Bayesian networks. In Bruce DâAmbrosio, Philippe Smets, and Piero Bonissone, editors, Proceedings of the Seventh Annual Conference on Uncertainty in Artificial Intelligence (UAIâ1991), pages 52â60, 1991
David Heckerman, Dan Geiger, and David M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3):197â243, 1995.
Learning models from new datasets
An easy way to run Bims on your data is to create a new directory and within that sub-directory data/ copy your data there and pass options data/1 to the basename of the data file.
For example,
Learning new statistical models.
By defining a new likelihood function and new priors the system can be used on new statistical models.
Resolution
In addition to model structure learning Bims implements two way of performing resolution over DLPs: stochastic sampling resolution (SSD) and SLD-based probabilisic inference.
Stochastic sampling definite clause (SSD) resolution
These predicates allow to sample from a loaded distributional logic program (Dlp). The resolution strategy here are that of chosing between probabilistic choices according to their relative values. The main idea is that when sampling many times from a top goal will in the long run sample each derivation path in proportion to the probability of the derivation. The probability of a derivation/refutation, is simply the product of all the probabilities attached to resolution steps during the derivation.
See
SLD-based probabilisic inference
These predicates allow standard SLD exploration of a stochastic query against a DLP. Predicates here allow to explore what is derivable and often attach a probability and ather information to each derivation.
Note that in probabilistic inference we often are more interested in failures than in standard LP. This is because there is a probability mass loss which each failed probabilistic branch.
Probabilistic inference predicates
Predicates index
Pack info
bims_default(-Def)
.