HPC-Net team publications - 2010

Articles

  1. Jalel Ben-Othman and Bashir Yahya. Energy efficient and QoS based routing protocol for wireless sensor networks. Journal of Parallel and Distributed Computing 70(8):849-857, August 2010. BibTeX

    @article{BEYA10a,
    	author = "Ben-Othman, Jalel and Yahya, Bashir",
    	title = "Energy efficient and {QoS} based routing protocol for wireless sensor networks",
    	journal = "Journal of Parallel and Distributed Computing",
    	year = 2010,
    	volume = 70,
    	pages = "849--857",
    	number = 8,
    	month = "August",
    	annote = "revint",
    	owner = "MOIS",
    	timestamp = "2011.07.25"
    }
    
  2. Jalel Ben-Othman and Bashir Yahya. Energy Efficient and QoS Aware Medium Access Control for Wireless Sensor Networks. Wiley Concurrency and Computation-Practice and Experience 22(10):1252-1266, July 2010. BibTeX

    @article{BEYA10b,
    	author = "Ben-Othman, Jalel and Yahya, Bashir",
    	title = "Energy {E}fficient and {QoS} {A}ware {M}edium {A}ccess {C}ontrol for {W}ireless {S}ensor {N}etworks",
    	journal = "Wiley Concurrency and Computation-Practice and Experience",
    	year = 2010,
    	volume = 22,
    	pages = "1252--1266",
    	number = 10,
    	month = "July",
    	annote = "revint",
    	owner = "MOIS"
    }
    
  3. Marouane Belaoucha, Denis Barthou, Adrien Eliche and Sid-Ahmed-Ali Touati. FADAlib: an open source C++ library for fuzzy array dataflow analysis. Procedia Computer Science, pages 2075-2084, May 2010. URL BibTeX

    @article{BBET10,
    	author = "Belaoucha, Marouane and Barthou, Denis and Eliche, Adrien and Touati, Sid-Ahmed-Ali",
    	title = "{FADAlib:} an open source {C++} library for fuzzy array dataflow analysis",
    	journal = "Procedia Computer Science",
    	year = 2010,
    	pages = "2075--2084",
    	month = "May",
    	address = "Amsterdam, Netherlands",
    	affiliation = "Parall{\'e}lisme, R{\'e}seaux, Syst{\`e}mes d'information, Mod{\'e}lisation - PRISM - CNRS : UMR8144 - Universit{\'e} de Versailles-Saint Quentin en Yvelines - ALCHEMY - INRIA Saclay - Ile de France - INRIA - CNRS - UMR8623 - Universit{\'e} Paris Sud - Paris XI",
    	annote = "revint",
    	audience = "internationale",
    	hal_id = "hal-00551673",
    	language = "Anglais",
    	owner = "MOIS",
    	timestamp = "2011.07.25",
    	url = "http://hal.archives-ouvertes.fr/hal-00551673/en/"
    }
    
  4. Rola Naja and Abed Ahmad. On Dynamic Resource allocation in WiMAX Networks. INFOCOMP Journal 9(3):42-51, 2010. BibTeX

    @article{NAAH10,
    	author = "Naja, Rola and Ahmad, Abed",
    	title = "On {D}ynamic {R}esource allocation in {WiMAX} {N}etworks",
    	journal = "INFOCOMP Journal",
    	year = 2010,
    	volume = 9,
    	pages = "42-51",
    	number = 3,
    	annote = "revint",
    	owner = "MOIS",
    	timestamp = "2012.02.06"
    }
    
  5. Nabila Labraoui, Abdelhak Gueroui, Makhlouf Aliouat and Zia Tanveer. Data Aggregation Security challenge in Wireless Sensor Networks A Survey. AHSWN Journal - Adhoc & Sensor Wireless Networks, 2010. BibTeX

    @article{AGLT10,
    	author = "Labraoui, Nabila and Gueroui, Abdelhak and Aliouat, Makhlouf and Tanveer, Zia",
    	title = "Data {A}ggregation {S}ecurity challenge in {W}ireless {S}ensor {N}etworks {A} {S}urvey",
    	journal = "AHSWN Journal - Adhoc {\&} Sensor Wireless Networks",
    	year = 2010,
    	annote = "revint",
    	owner = "MOIS",
    	timestamp = "2012.07.04"
    }
    
  6. Peter G Harrison, Naresh M Patel and Soraya Zertal. Response time distribution of flash memory accesses. Performance Evaluation 67(4):248-259, 2010. BibTeX

    @article{HPZE10,
    	author = "Peter G. Harrison and Naresh M. Patel and Soraya Zertal",
    	title = "Response time distribution of flash memory accesses",
    	journal = "Performance Evaluation",
    	year = 2010,
    	volume = 67,
    	pages = "248--259",
    	number = 4,
    	annote = "revint"
    }
    
  7. Jalel Ben-Othman and Lynda Mokdad. Enhancing data security in ad hoc networks based on multipath routing. Elsevier Journal of Parallel and Distributed Computing 70(3):309-316, 2010. BibTeX

    @article{BEMO10,
    	author = "Ben-Othman, Jalel and Mokdad, Lynda",
    	title = "Enhancing data security in ad hoc networks based on multipath routing",
    	journal = "Elsevier Journal of Parallel and Distributed Computing",
    	year = 2010,
    	volume = 70,
    	pages = "309-316",
    	number = 3,
    	month = "March",
    	annote = "revint",
    	owner = "MOIS",
    	timestamp = "2011.07.25"
    }
    

Book

  1. Claude Timsit. Du transistor à l'ordinateur. Hermann, 2010. BibTeX

    @book{TICL10,
    	title = "Du transistor {\`a} l'ordinateur",
    	publisher = "Hermann",
    	year = 2010,
    	author = "Timsit, Claude",
    	annote = "livre",
    	isbn = "978 2 7056 6972 0",
    	owner = "MOIS",
    	timestamp = "2012.01.31"
    }
    

Inbooks

  1. L Shang, S Petiton, N Emad and X Yang. Cloud Computing: Principles, Systems and Applications. Chapter YML-PC: A Reference Architecture Based on Workflow for Building Scientific Private Clouds, pages 145-162, Springer-Verlag, 2010. DOI BibTeX

    @inbook{SPEY10,
    	chapter = "{YML}-{PC}: {A} {R}eference {A}rchitecture {B}ased on {W}orkflow for {B}uilding {S}cientific {P}rivate {C}louds",
    	pages = "145-162",
    	title = "Cloud Computing: Principles, Systems and Applications",
    	publisher = "Springer-Verlag",
    	year = 2010,
    	author = "Shang, L. and Petiton, S. and Emad, N. and Yang, X.",
    	annote = "chapitre",
    	doi = "http://dx.doi.org/10.1007/978-1-84996-241-4"
    }
    

Inproceedings

  1. Rola Naja, Abdulkader Oubari and Samir Tohme. WiMAX and HSDPA Networks Playing Vertical Mobility Games. In IEEE Seventh International Conference on Wireless and Optical Communications Networks. September 2010, 1-5. DOI BibTeX

    @inproceedings{NOTO10,
    	author = "Naja, Rola and Oubari, Abdulkader and Tohme, Samir",
    	title = "{WiMAX} and {HSDPA} {N}etworks {P}laying {V}ertical {M}obility {G}ames",
    	booktitle = "IEEE Seventh International Conference on Wireless and Optical Communications Networks",
    	year = 2010,
    	pages = "1-5",
    	month = "September",
    	annote = "confint",
    	doi = "10.1109/WOCN.2010.5587340",
    	owner = "MOIS",
    	timestamp = "2012.02.06"
    }
    
  2. Rola Naja. How to Bridge the Knowledge Gap and the Digital Divide Between the North and the South. In International Conference Biovision. April 2010. BibTeX

    @inproceedings{NARO10,
    	author = "Naja, Rola",
    	title = "How to {B}ridge the {K}nowledge {G}ap and the {D}igital {D}ivide {B}etween the {N}orth and the {S}outh",
    	booktitle = "International Conference Biovision",
    	year = 2010,
    	month = "April",
    	publisher = "Bibliotheca Alexandrina",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2012.02.06"
    }
    
  3. Stephane Zuckerman and William Jalby. Tackling Cache-Line Stealing Effects Using Run-Time Adaptation. In LCPC2010: Proceedings of the 23rd International Conference on Languages and Compilers for Parallel Computing. 2010, 62-76. BibTeX

    @inproceedings{ZUJA10,
    	author = "Zuckerman, Stephane and Jalby, William",
    	title = "Tackling {C}ache-{L}ine {S}tealing {E}ffects {U}sing {R}un-{T}ime {A}daptation",
    	booktitle = "LCPC2010: Proceedings of the 23rd International Conference on Languages and Compilers for Parallel Computing",
    	year = 2010,
    	pages = "62--76",
    	publisher = "Springer-Verlag",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2011.07.28"
    }
    
  4. Soraya Zertal and Wilfierd Dron. Quantitative Study of Solid State Disks for Mass Storage. In International Symposium on Performance Evaluation of Computer & Telecommunication Systems (SPECTS). 2010, 149-155. BibTeX

    @inproceedings{ZEDR10,
    	author = "Zertal, Soraya and Dron, Wilfierd",
    	title = "Quantitative {S}tudy of {S}olid {S}tate {D}isks for {M}ass {S}torage",
    	booktitle = "International Symposium on Performance Evaluation of Computer {\&} Telecommunication Systems (SPECTS)",
    	year = 2010,
    	pages = "149-155",
    	annote = "confint"
    }
    
  5. Bashir Yahya and Jalel Ben-Othman. A peer-to-peer based naming system for Mobile Ad Hoc Networks. In The 35th IEEE Conference on Local Computer Networks (LCN). 2010, 821-826. BibTeX

    @inproceedings{YABE10,
    	author = "Yahya, Bashir and Ben-Othman, Jalel",
    	title = "A peer-to-peer based naming system for {M}obile {Ad Hoc} {N}etworks",
    	booktitle = "The 35th IEEE Conference on Local Computer Networks (LCN)",
    	year = 2010,
    	pages = "821-826",
    	month = "October 11th-14th",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2011.07.25"
    }
    
  6. Claude Timsit and Soraya Zertal. Using Spreadsheets to Teach Computer Architecture. In International Conference on Computer Supported Education (CSEDU). 2010. BibTeX

    @inproceedings{TIZE10,
    	author = "Timsit, Claude and Zertal, Soraya",
    	title = "Using {S}preadsheets to {T}each {C}omputer {A}rchitecture",
    	booktitle = "International Conference on Computer Supported Education (CSEDU)",
    	year = 2010,
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2011.11.04"
    }
    
  7. Abdelhafid Mazouz, Sid-Ahmed-Ali Touati and Denis Barthou. Study of Variations of Native Program Execution Times on Multi-core Architectures. In International IEEE Conference on Complex, Intelligent and Software Intensive Systems. 2010, 919-924. URL BibTeX

    @inproceedings{MTBA10b,
    	author = "Mazouz, Abdelhafid and Touati, Sid-Ahmed-Ali and Barthou, Denis",
    	title = "Study of {V}ariations of {N}ative {P}rogram {E}xecution {T}imes on {M}ulti-core {A}rchitectures",
    	booktitle = "International IEEE Conference on Complex, Intelligent and Software Intensive Systems",
    	year = 2010,
    	pages = "919-924",
    	month = "Feb.",
    	affiliation = "Parall{\'e}lisme, R{\'e}seaux, Syst{\`e}mes d'information, Mod{\'e}lisation - PRISM - CNRS - UMR8144 - Universit{\'e} de Versailles-Saint Quentin en Yvelines - ALCHEMY - INRIA Saclay - Ile de France - INRIA - CNRS - UMR8623 - Universit{\'e} Paris Sud - Paris XI",
    	annote = "confint",
    	audience = "internationale",
    	hal_id = "hal-00551581",
    	language = "Anglais",
    	owner = "MOIS",
    	timestamp = "2011.07.25",
    	url = "http://hal.archives-ouvertes.fr/hal-00551581/en/"
    }
    
  8. Nabila Labraoui, Abdelhak Gueroui, J Petit and Makhlouf Aliouat. Adaptive Security Level for Data Aggregation in Wireless Sensor Networks. In IEEE 5th International Symposium on Wireless Pervasive Computing-ISWPC. 2010. BibTeX

    @inproceedings{AGLP10,
    	author = "Labraoui, Nabila and Gueroui, Abdelhak and Petit, J. and Aliouat, Makhlouf",
    	title = "Adaptive {S}ecurity {L}evel for {D}ata {A}ggregation in {W}ireless {S}ensor {N}etworks",
    	booktitle = "IEEE 5th International Symposium on Wireless Pervasive Computing-ISWPC",
    	year = 2010,
    	month = "May 5-7 th",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2012.07.04"
    }
    
  9. Kinda Khawam and Dana Marinca. Size-based Proportional Fair Scheduling. In PIMRC 2010. 2010, 1781-1786. BibTeX

    @inproceedings{KHMA10,
    	author = "Khawam, Kinda and Marinca, Dana",
    	title = "Size-based {P}roportional {F}air {S}cheduling",
    	booktitle = "PIMRC 2010",
    	year = 2010,
    	pages = "1781-1786",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2011.07.28"
    }
    
  10. Kinda Khawam, Marc Ibrahim and Samir Tohme. Centralised multi-class Access Control in a WiMAX-UMTS hybrid network. In IEEE PIMRC 2010. 2010, 1265-1270. BibTeX

    @inproceedings{KITO10,
    	author = "Khawam, Kinda and Ibrahim, Marc and Tohme, Samir",
    	title = "Centralised multi-class {A}ccess {C}ontrol in a {WiMAX-UMTS} hybrid network",
    	booktitle = "IEEE PIMRC 2010",
    	year = 2010,
    	pages = "1265--1270",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2011.07.27"
    }
    
  11. Nora Izri, Jean-Michel Fourneau, Dana Marinca and Samir Tohme. Impact de la Convergence Fixe-Mobile sur un réseau optique de collecte. In 11èmes Journées Doctorales en Informatique et Réseaux - JDIR. 2010.
    Abstract Optical fiber is an essential element of current research networks, since it makes it possible to combine dynamic resource allocation with the high bit rates on offer. To this end, we present an optical ring in which QoS (Quality of Service) can be guaranteed at the sub-wavelength level. The architecture is based on a reservation mechanism centralized at the hub. The hub acts as a scheduler, since it defines the set of reservation rectangles for a node, called a pattern. Two types of traffic can flow through this network: traffic from fixed networks and traffic from mobile networks. This paper describes a concept of fixed-mobile convergence (FMC) at the physical layer in an all-optical collection network and analyzes the proposed FMC scenarios. We also study the waiting time of packets before their insertion on the optical ring using fixed-size PDUs (Physical Data Units), as well as the utilization rate of the optical PDU container by IP, ATM or GSM client packets. Our results show that the distribution of slots within the window affects the waiting time of client packets. Moreover, sharing optical containers among CoS (Classes of Service), whatever the type of network, has a major effect on their utilization rate. URL BibTeX

    @inproceedings{FIMT10,
    	author = "Izri, Nora and Fourneau, Jean-Michel and Marinca, Dana and Tohme, Samir",
    	title = "Impact de la {C}onvergence {F}ixe-{M}obile sur un r{\'e}seau optique de collecte",
    	booktitle = "11{\`e}mes Journ{\'e}es Doctorales en Informatique et R{\'e}seaux - JDIR",
    	year = 2010,
    	abstract = "La fibre optique est un {\'e}l{\'e}ment incontournable dans les r{\'e}seaux de recherche actuels, puisqu'elle permet de combiner l'allocation dynamique des ressources aux hauts d{\'e}bits offerts. Pour cela, nous pr{\'e}sentons un anneau optique o{\`u} l'on permet la garantie de QoS (Quality of Service) au niveau des sous-longueurs d'onde. L'architecture est bas{\'e}e sur un m{\'e}canisme de r{\'e}servations centralis{\'e} au niveau du hub. Ce dernier a la fonctionnalit{\'e} d'ordonnanceur vu qu'il d{\'e}finit l'ensemble des rectangles de r{\'e}servations pour un noeud, appel{\'e} pattern. Deux types de trafic peuvent circuler dans ce r{\'e}seau : le trafic provenant des r{\'e}seaux fixes et le trafic provenant des r{\'e}seaux mobiles. Dans ce papier, un concept de convergence fixe-mobile (CFM) au niveau de la couche physique dans un r{\'e}seau de collecte tout optique est d{\'e}crit. Une analyse des sc{\'e}narii de CFM propos{\'e}s a {\'e}t{\'e} faite. Nous {\'e}tudions {\'e}galement l'attente des paquets avant leur insertion sur l'anneau optique en utilisant des PDUs (Physical Data Unit) de taille fixe, ainsi que le taux d'utilisation du container optique PDU par les paquets clients IP, ATM ou GSM. Nos r{\'e}sultats montrent que la r{\'e}partition des slots dans la fen{\^e}tre a un impact sur le temps d'attente des paquets clients. D'autre part, le partage des containers optiques entre les CoS (Class of Service) quel que soit le type de r{\'e}seau, a un effet majeur sur leur taux d'utilisation.",
    	affiliation = "Parall{\'e}lisme, R{\'e}seaux, Syst{\`e}mes d'information, Mod{\'e}lisation - PRISM - CNRS : UMR8144 - Universit{\'e} de Versailles-Saint Quentin en Yvelines",
    	annote = "confnat",
    	audience = "internationale",
    	file = "jdir2010IZRIvf.pdf:http\://hal.inria.fr/inria-00467940/PDF/ jdir2010IZRIvf.pdf:PDF",
    	hal_id = "inria-00467940",
    	language = "Fran{\c c}ais",
    	owner = "MOIS",
    	timestamp = "2011.07.25",
    	url = "http://hal.inria.fr/inria-00467940/en/"
    }
    
  12. Marc Ibrahim, Kinda Khawam and Samir Tohme. Congestion Games for Distributed Radio Access Selection in Broadband Networks. In IEEE GLOBECOM 2010. 2010, 1-5. BibTeX

    @inproceedings{IKTO10,
    	author = "Ibrahim, Marc and Khawam, Kinda and Tohme, Samir",
    	title = "Congestion {G}ames for {D}istributed {R}adio {A}ccess {S}election in {B}roadband {N}etworks",
    	booktitle = "IEEE GLOBECOM 2010",
    	year = 2010,
    	pages = "1--5",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2011.07.27"
    }
    
  13. Yuanjie Huang, Liang Peng, Chengyong Wu, Yury Kashnikov, Jorn Rennecke and Grigori Fursin. Transforming GCC into a research-friendly environment: plugins for optimization tuning and reordering, function cloning and program instrumentation. In 2nd International Workshop on GCC Research Opportunities (GROW'10). 2010. Google Summer of Code'09.
    Abstract Computer scientists are always eager to have a powerful, robust and stable compiler infrastructure. However, until recently, researchers had to either use available and often unstable research compilers, create new ones from scratch, try to hack open-source non-research compilers or use source to source tools. It often requires duplication of a large amount of functionality available in current production compilers while making questionable the practicality of the obtained research results. The Interactive Compilation Interface (ICI) has been introduced to avoid such time-consuming replication and transform popular, production compilers such as GCC into research toolsets by providing an ability to access, modify and extend GCC's internal functionality through a compiler-dependent hook and clear compiler-independent API with external portable plugins without interrupting the natural evolution of a compiler. In this paper, we describe our recent extensions to GCC and ICI with the preliminary experimental data to support selection and reordering of optimization passes with a dependency grammar, control of individual transformations and their parameters, generic function cloning and program instrumentation. We are synchronizing these developments implemented during Google Summer of Code'09 program with the mainline GCC 4.5 and its native low-level plugin system. These extensions are intended to enable and popularize the use of GCC for realistic research on empirical iterative feedback-directed compilation, statistical collective optimization, run-time adaptation and development of intelligent self-tuning computing systems among other important topics. Such research infrastructure should help researchers prototype and validate their ideas quickly in realistic, production environments while keeping portability of their research plugins across different releases of a compiler. Moreover, it should also allow to move successful ideas back to GCC much faster thus helping to improve, modularize and clean it up. Furthermore, we are porting GCC with ICI extensions for performance/power auto-tuning for data centers and cloud computing systems with heterogeneous architectures or for continuous whole system optimization. URL BibTeX

    @inproceedings{HPW+10,
    	author = "Huang, Yuanjie and Peng, Liang and Wu, Chengyong and Kashnikov, Yury and Rennecke, Jorn and Fursin, Grigori",
    	title = "Transforming {GCC} into a research-friendly environment: plugins for optimization tuning and reordering, function cloning and program instrumentation",
    	booktitle = "2nd {I}nternational {W}orkshop on {GCC} {R}esearch {O}pportunities (GROW'10)",
    	year = 2010,
    	month = "Jan.",
    	note = "Google Summer of Code'09",
    	abstract = "Computer scientists are always eager to have a powerful, robust and stable compiler infrastructure. However, until recently, researchers had to either use available and often unstable research compilers, create new ones from scratch, try to hack open-source non-research compilers or use source to source tools. It often requires duplication of a large amount of functionality available in current production compilers while making questionable the practicality of the obtained research results. The Interactive Compilation Interface (ICI) has been introduced to avoid such time-consuming replication and transform popular, production compilers such as GCC into research toolsets by providing an ability to access, modify and extend GCC's internal functionality through a compiler-dependent hook and clear compiler-independent API with external portable plugins without interrupting the natural evolution of a compiler. In this paper, we describe our recent extensions to GCC and ICI with the preliminary experimental data to support selection and reordering of optimization passes with a dependency grammar, control of individual transformations and their parameters, generic function cloning and program instrumentation. We are synchronizing these developments implemented during Google Summer of Code'09 program with the mainline GCC 4.5 and its native low-level plugin system. These extensions are intended to enable and popularize the use of GCC for realistic research on empirical iterative feedback-directed compilation, statistical collective optimization, run-time adaptation and development of intelligent self-tuning computing systems among other important topics. Such research infrastructure should help researchers prototype and validate their ideas quickly in realistic, production environments while keeping portability of their research plugins across different releases of a compiler. Moreover, it should also allow to move successful ideas back to GCC much faster thus helping to improve, modularize and clean it up. Furthermore, we are porting GCC with ICI extensions for performance/power auto-tuning for data centers and cloud computing systems with heterogeneous architectures or for continuous whole system optimization.",
    	affiliation = "Institute of Computing Technology - Chinese Academy of Science-ICT-Chinese Academy of Science (CAS)- Parall{\'e}lisme, R{\'e}seaux, Syst{\`e}mes d'information, Mod{\'e}lisation - PRISM - CNRS : UMR8144 -Universit{\'e} de Versailles-Saint Quentin en Yvelines - ALCHEMY-INRIA Saclay- Ile de France - INRIA - CNRS : UMR8623 - Universit{\'e} Paris Sud - Paris XI",
    	annote = "confint",
    	audience = "internationale",
    	file = "hpwp2010.pdf:http\://hal.inria.fr/inria-00451106/PDF/ hpwp2010.pdf:PDF",
    	hal_id = "inria-00451106",
    	language = "Anglais",
    	owner = "MOIS",
    	timestamp = "2011.07.25",
    	url = "http://hal.inria.fr/inria-00451106/en/"
    }
    
  14. Nahid Emad, Olivier Delannoy and Makarem Dandouna. A design approach for numerical libraries in large scale distributed systems. In Proceedings of the ACS/IEEE International Conference on Computer Systems and Applications - AICCSA 2010. 2010. URL, DOI BibTeX

    @inproceedings{EDDA10a,
    	author = "Emad, Nahid and Delannoy, Olivier and Dandouna, Makarem",
    	title = "A design approach for numerical libraries in large scale distributed systems",
    	booktitle = "Proceedings of the ACS/IEEE International Conference on Computer Systems and Applications - AICCSA 2010",
    	year = 2010,
    	series = "AICCSA2010",
    	address = "Washington, DC, USA",
    	publisher = "IEEE Computer Society",
    	acmid = 1908471,
    	annote = "confint",
    	doi = "http://dx.doi.org/10.1109/AICCSA.2010.5586951",
    	isbn = "978-1-4244-7716-6",
    	numpages = 9,
    	url = "http://dx.doi.org/10.1109/AICCSA.2010.5586951"
    }
    
  15. N Emad. Multiple Restarted Arnoldi Methods to Solve Eigenproblem. In the Proceedings of the 11th Copper Mountain Conference on Iterative Methods. 2010. BibTeX

    @inproceedings{EMNA10,
    	author = "N. Emad",
    	title = "{M}ultiple {R}estarted {A}rnoldi {M}ethods to {S}olve {E}igenproblem",
    	booktitle = "the Proceedings of the 11th Copper Mountain Conference on Iterative Methods",
    	year = 2010,
    	month = "April 4-9",
    	annote = "confint"
    }
    
  16. Majed Chatti, S Yahia, Claude Timsit and Soraya Zertal. A Hypercube-Based NoC Routing Algorithm for Efficient All-to-All Communications in Embedded Image and Signal Processing Applications. In IEEE Conference on High Performance Computing and Simulation (HPCS). 2010. Best Short Paper Award. BibTeX

    @inproceedings{CTYZ10,
    	author = "Chatti, Majed and Yahia, S. and Timsit, Claude and Zertal, Soraya",
    	title = "A {H}ypercube-{B}ased {NoC} {R}outing {A}lgorithm for {E}fficient {A}ll-to-{A}ll {C}ommunications in {E}mbedded {I}mage and {S}ignal {P}rocessing {A}pplications",
    	booktitle = "IEEE Conference on High Performance Computing and Simulation (HPCS)",
    	year = 2010,
    	note = "Best Short Paper Award",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2011.11.04"
    }
    
  17. Frederic Brault, Benoit Dupont-De-Dinechin, Sid-Ahmed-Ali Touati and Albert Cohen. Software Pipelining and Register Pressure in VLIW Architectures: Preconditioning Data Dependence Graphs is Experimentally Better Than Lifetime-Sensitive Scheduling. In 8th Workshop on Optimizations for DSP and Embedded Systems (ODES'10). 2010.
    Abstract Embedding register-pressure control in software pipelining heuristics is the dominant approach in modern back-end compilers. However, aggressive attempts at combining resource and register constraints in software pipelining have failed to scale to real-life loops, leaving weaker heuristics as the only practical solutions. We propose a decoupled approach where register pressure is controlled before scheduling, and evaluate its effectiveness in combination with three representative software pipelining algorithms. We present conclusive experiments in a production compiler on a wealth of media processing and general purpose benchmarks. URL BibTeX

    @inproceedings{BCD+10,
    	author = "Brault, Frederic and Dupont-De-Dinechin, Benoit and Touati, Sid-Ahmed-Ali and Cohen, Albert",
    	title = "Software {P}ipelining and {R}egister {P}ressure in {VLIW} {A}rchitectures: Preconditioning {D}ata {D}ependence {G}raphs is {E}xperimentally {B}etter {T}han {L}ifetime-{S}ensitive {S}cheduling",
    	booktitle = "8th Workshop on Optimizations for DSP and Embedded Systems (ODES'10)",
    	year = 2010,
    	month = "Apr.",
    	abstract = "Embedding register-pressure control in software pipelining heuristics is the dominant approach in modern back-end compilers. However, aggressive attempts at combining resource and register constraints in software pipelining have failed to scale to real-life loops, leaving weaker heuristics as the only practical solutions. We propose a decoupled approach where register pressure is controlled before scheduling, and evaluate its effectiveness in combination with three representative software pipelining algorithms. We present conclusive experiments in a production compiler on a wealth of media processing and general purpose benchmarks.",
    	affiliation = "ALCHEMY - INRIA Saclay - Ile de France - INRIA - CNRS - UMR8623 - Universit{\'e} Paris Sud - Paris XI - Kalray - Parall{\'e}lisme, R{\'e}seaux, Syst{\`e}mes d'information, Mod{\'e}lisation - PRISM - CNRS - UMR8144 - Universit{\'e} de Versailles-Saint Quentin en Yvelines",
    	annote = "confint",
    	audience = "internationale",
    	file = "SubmitODES2010.pdf:http\://hal.inria.fr/inria-00551515/PDF/ SubmitODES2010.pdf:PDF",
    	hal_id = "inria-00551515",
    	language = "Anglais",
    	owner = "MOIS",
    	timestamp = "2011.07.25",
    	url = "http://hal.inria.fr/inria-00551515/en/"
    }
    
  18. Jalel Ben-Othman and B Yahiya. RELAX : An Energy Efficient Multipath Routing Protocol for Wireless Sensor Networks. In Proceedings of the IEEE International Conference on communications in Wireless networks symposium (ICC 10). 2010. BibTeX

    @inproceedings{BEYA10d,
    	author = "Ben-Othman, Jalel and Yahiya, B.",
    	title = "{RELAX} : {A}n {E}nergy {E}fficient {M}ultipath {R}outing {P}rotocol for {W}ireless {S}ensor {N}etworks",
    	booktitle = "Proceedings of the IEEE International Conference on communications in Wireless networks symposium (ICC 10)",
    	year = 2010,
    	month = "23-27 May",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2012.02.10"
    }
    
  19. Jalel Ben-Othman and B Yahiya. A Peer-to-Peer Based Naming System for Mobile Ad Hoc Networks. In Proceedings of the 35th IEEE Conference on Local Computer Networks (LCN2010). 2010. BibTeX

    @inproceedings{BEYA10C,
    	author = "Ben-Othman, Jalel and Yahiya, B.",
    	title = "A {P}eer-to-{P}eer {B}ased {N}aming {S}ystem for {M}obile {Ad Hoc} {N}etworks",
    	booktitle = "Proceedings of the 35th IEEE Conference on Local Computer Networks (LCN2010)",
    	year = 2010,
    	month = "11-14 October",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2012.02.10"
    }
    
  20. Jalel Ben-Othman, Lynda Mokdad and Mohamed Ould Cheikh. On Improving the performance of IEEE 802.11s based wireless mesh networks using directional antenna. In The 35th IEEE Conference on Local Computer Networks (LCN). 2010, 785-790. BibTeX

    @inproceedings{BMOU10,
    	author = "Ben-Othman, Jalel and Mokdad, Lynda and Ould Cheikh, Mohamed",
    	title = "On {I}mproving the performance of {IEEE} 802.11s based wireless mesh networks using directional antenna",
    	booktitle = "The 35th IEEE Conference on Local Computer Networks (LCN)",
    	year = 2010,
    	pages = "785--790",
    	organization = "IEEE",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2011.07.25"
    }
    
  21. Jalel Ben-Othman and S Ghazal. Traffic Policing Based on Token Bucket Mechanism for WiMAX Networks. In Proceedings of the IEEE International Conference on communications in Wireless networks symposium (ICC 10). 2010. BibTeX

    @inproceedings{BEGH10,
    	author = "Ben-Othman, Jalel and Ghazal, S.",
    	title = "Traffic {P}olicing {B}ased on {T}oken {B}ucket {M}echanism for {WiMAX} {N}etworks",
    	booktitle = "Proceedings of the IEEE International Conference on communications in Wireless networks symposium (ICC 10)",
    	year = 2010,
    	month = "23-27 May",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2012.02.10"
    }
    
  22. Jalel Ben-Othman, Serigne Diagne, Lynda Mokdad and Bashir Yahya. Performance evaluation of a hybrid MAC protocol for wireless sensor networks. In The 13-th ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems. 2010, 327-334. BibTeX

    @inproceedings{BDMY10,
    	author = "Ben-Othman, Jalel and Diagne, Serigne and Mokdad, Lynda and Yahya, Bashir",
    	title = "Performance evaluation of a hybrid {MAC} protocol for wireless sensor networks",
    	booktitle = "The 13-th ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems",
    	year = 2010,
    	pages = "327-334",
    	month = "October 11th-14th",
    	annote = "confint",
    	owner = "MOIS",
    	timestamp = "2011.07.25"
    }
    
  23. Jean-Christian Angles D'Auriac, Denis Barthou, Damir Becirevic, Rene Bilhaut, François Bodin, Philippe Boucaud, Olivier Brand-Foissac, Jaume Carbonell, Christine Eisenbeis, P Gallard, Gilbert Grosdidier, P Guichon, P F Honore, G Le Meur, P Pene, L Rilling, P Roudeau, André Seznec and A Stocchi. Towards the Petaflop for Lattice QCD Simulations the PetaQCD Project. In J Gruntorad and M Lokajicek (eds.). Journal of Physics Conference Series 219. 2010, 052021.
    Abstract The study and design of a very ambitious petaflop cluster exclusively dedicated to Lattice QCD simulations started in early 08 among a consortium of 7 laboratories (IN2P3, CNRS, INRIA, CEA) and 2 SMEs. This consortium received a grant from the French ANR agency in July 08, and the PetaQCD project kickoff took place in January 09. Building upon several years of fruitful collaborative studies in this area, the aim of this project is to demonstrate that the simulation of a 256 x 128^3 lattice can be achieved through the HMC/ETMC software, using a machine with efficient speed/cost/reliability/power consumption ratios. It is expected that this machine can be built out of a rather limited number of processors (e.g. between 1000 and 4000), although capable of a sustained petaflop CPU performance. The proof-of-concept should be a mock-up cluster built as much as possible with off-the-shelf components, and 2 particularly attractive axes will be mainly investigated, in addition to fast all-purpose multi-core processors: the use of the new brand of IBM-Cell processors (with on-chip accelerators) and the very recent Nvidia GP-GPUs (off-chip co-processors). This cluster will obviously be massively parallel, and heterogeneous. Communication issues between processors, implied by the Physics of the simulation and the lattice partitioning, will certainly be a major key to the project. URL, DOI BibTeX

    @inproceedings{ABB+10,
    	author = "Angles D'Auriac, Jean-Christian and Barthou, Denis and Becirevic, Damir and Bilhaut, Rene and Bodin, Fran{\c{c}}ois and Boucaud, Philippe and Brand-Foissac, Olivier and Carbonell, Jaume and Eisenbeis, Christine and Gallard, P. and Grosdidier, Gilbert and Guichon, P. and Honore, P.F. and Le Meur, G. and Pene, P. and Rilling, L. and Roudeau, P. and Seznec, Andr{\'e} and Stocchi, A.",
    	title = "Towards the {P}etaflop for {L}attice {QCD} {S}imulations the {P}eta{QCD} {P}roject",
    	booktitle = "Journal of Physics Conference Series",
    	year = 2010,
    	editor = "Gruntorad, J. and Lokajicek, M.",
    	volume = 219,
    	pages = "052021",
    	publisher = "IOP Publishing",
    	abstract = "The study and design of a very ambitious petaflop cluster exclusively dedicated to Lattice QCD simulations started in early 08 among a consortium of 7 laboratories (IN2P3, CNRS, INRIA, CEA) and 2 SMEs. This consortium received a grant from the French ANR agency in July 08, and the PetaQCD project kickoff took place in January 09. Building upon several years of fruitful collaborative studies in this area, the aim of this project is to demonstrate that the simulation of a $256 \times 128^3$ lattice can be achieved through the HMC/ETMC software, using a machine with efficient speed/cost/reliability/power consumption ratios. It is expected that this machine can be built out of a rather limited number of processors (e.g. between 1000 and 4000), although capable of a sustained petaflop CPU performance. The proof-of-concept should be a mock-up cluster built as much as possible with off-the-shelf components, and 2 particularly attractive axes will be mainly investigated, in addition to fast all-purpose multi-core processors: the use of the new brand of IBM-Cell processors (with on-chip accelerators) and the very recent Nvidia GP-GPUs (off-chip co-processors). This cluster will obviously be massively parallel, and heterogeneous. Communication issues between processors, implied by the Physics of the simulation and the lattice partitioning, will certainly be a major key to the project.",
    	affiliation = "Laboratoire de Physique Subatomique et de Cosmologie - LPSC - CNRS: UMR5821 - IN2P3 - Universit{\'e} Joseph Fourier - Grenoble I - Institut Polytechnique de Grenoble Mod{\'e}lisation - PRISM - CNRS : UMR8144 - Universit{\'e} de Versailles-Saint Quentin en Yvelines - Laboratoire de Physique Th{\'e}orique d'Orsay - LPT - CNRS : UMR8627 - Universit{\'e} Paris Sud - Paris XI - Laboratoire de l'Acc{\'e}l{\'e}rateur Lin{\'e}aire-LAL-CNRS : UMR8607 - IN2P3 -Universit{\'e} Paris Sud - ParisXI - Institut de Recherches sur les lois Fondamentales de l'Univers (ex DAPNIA) - IRFU - CEA : DSM/IRFU - ALF - INRIA - IRISA - INRIA-Universit{\'e} de Rennes I",
    	annote = "confint",
    	audience = "international",
    	doi = "10.1088/1742-6596/219/5/052021",
    	hal_id = "in2p3-00380246",
    	owner = "MOIS",
    	timestamp = "2011.07.25",
    	url = "http://hal.in2p3.fr/in2p3-00380246/en/"
    }
    

Techreport

  1. Kinda Khawam, Marc Ibrahim, Johanne Cohen, Samer Lahoud and Samir Tohme. Individual vs. Global Radio Resource Management in a Hybrid Broadband Network. PRiSM - CNRS : UMR8144 - Université de Versailles-Saint Quentin en Yvelines - Ecole supérieure d'ingénieurs de Beyrouth -ESIB - Université Saint-Joseph- Beyrouth - ATNET - IRISA - Université de Rennes I - Institut National des Sciences Appliquées de Rennes - École normale supérieure de Cachan - ENS Cachan - CNRS: UMR6074, October 2010.
    Abstract Nowadays, with the abundance of diverse air interfaces in the same operating area, advanced Radio Resource Management (RRM) is vital to take advantage of the available system resources. In such a scenario, a mobile user will be able to connect concurrently to different wireless access networks. In this paper, we consider the downlink of a hybrid network with two broadband Radio Access Technologies (RAT): WiMAX and WiFi. Two approaches are proposed to load balance the traffic of every user between the two available RATs: an individual approach where mobile users selfishly strive to improve their performance and a global approach where resource allocation is made in a way to satisfy all mobile users. We devise for the individual approach a fully distributed resource management scheme portrayed as a non-cooperative game. We characterize the Nash equilibriums of the proposed RRM game and put forward a decentralized algorithm based on replicator dynamics to achieve those equilibriums. In the global approach, resources are assigned by the system in order to enhance global performances. For the two approaches, we show that after convergence, each user is connected to a single RAT which avoids costly traffic splitting between available RATs. URL BibTeX

    @techreport{KIC+10,
    	author = "Khawam, Kinda and Ibrahim, Marc and Cohen, Johanne and Lahoud, Samer and Tohme, Samir",
    	title = "Individual vs. {G}lobal {R}adio {R}esource {M}anagement in a {H}ybrid {B}roadband {N}etwork",
    	institution = "PRiSM - CNRS : UMR8144 - Universit{\'e} de Versailles-Saint Quentin en Yvelines - Ecole sup{\'e}rieure d'ing{\'e}nieurs de Beyrouth -ESIB - Universit{\'e} Saint-Joseph- Beyrouth - ATNET - IRISA - Universit{\'e} de Rennes I - Institut National des Sciences Appliqu{\'e}es de Rennes - {\'E}cole normale sup{\'e}rieure de Cachan - ENS Cachan - CNRS: UMR6074",
    	year = 2010,
    	month = "October",
    	abstract = "Nowadays, with the abundance of diverse air interfaces in the same operating area, advanced Radio Resource Management (RRM) is vital to take advantage of the available system resources. In such a scenario, a mobile user will be able to connect concurrently to different wireless access networks. In this paper, we consider the downlink of a hybrid network with two broadband Radio Access Technologies (RAT): WiMAX and WiFi. Two approaches are proposed to load balance the traffic of every user between the two available RATs: an individual approach where mobile users selfishly strive to improve their performance and a global approach where resource allocation is made in a way to satisfy all mobile users. We devise for the individual approach a fully distributed resource management scheme portrayed as a non-cooperative game. We characterize the Nash equilibriums of the proposed RRM game and put forward a decentralized algorithm based on replicator dynamics to achieve those equilibriums. In the global approach, resources are assigned by the system in order to enhance global performances. For the two approaches, we show that after convergence, each user is connected to a single RAT which avoids costly traffic splitting between available RATs.",
    	annote = "rapport",
    	file = "PI-1956.pdf:http\://hal.inria.fr/inria-00528575/PDF/PI-1956.pdf:PDF",
    	hal_id = "inria-00528575",
    	keywords = "Non-cooperative game theory, non-linear optimisation, WiMAX, WiFi, 4G networks",
    	language = "English",
    	owner = "MOIS",
    	pages = 9,
    	timestamp = "2011.07.25",
    	url = "http://hal.inria.fr/inria-00528575/en/"
    }
    
  2. Sid-Ahmed-Ali Touati, Julien Worms and Sebastien Briais. The Speedup Test. PRiSM-CNRS : UMR8144 - Université de Versailles Saint-Quentin en Yvelines - ALCHEMY - INRIA Saclay - Ile de France - INRIA - CNRS: UMR8623 - Université Paris Sud - Paris XI - Laboratoire de Mathématiques de Versailles - LM-Versailles - CNRS : UMR8100 - Université de Versailles Saint-Quentin en Yvelines, 2010. Software is included with the document; it implements the Speedup-Test protocol.
    Abstract Numerous code optimisation methods are usually experimented by doing multiple observations of the initial and the optimised execution times in order to declare a speedup. Even with fixed input and execution environment, program execution times vary in general. Hence different kinds of speedups may be reported: the speedup of the average execution time, the speedup of the minimal execution time, the speedup of the median, etc. Many published speedups in the literature are observations of a set of experiments. In order to improve the reproducibility of the experimental results, this technical report presents a rigorous statistical methodology regarding program performance analysis. We rely on well known statistical tests (Shapiro-Wilk's test, Fisher's F-test, Student's t-test, Kolmogorov-Smirnov's test, Wilcoxon-Mann-Whitney's test) to study if the observed speedups are statistically significant or not. By fixing $0\frac12$, the probability that an individual execution of the optimised code is faster than the individual execution of the initial code. Our methodology defines a consistent improvement compared to the usual performance analysis method in high performance computing as in \cite{Jain:1991:ACS,lilja:book}. We explain in each situation which hypotheses must be checked to declare a correct risk level for the statistics. The Speedup-Test protocol certifying the observed speedups with rigorous statistics is implemented and distributed as an open source tool based on R software. URL BibTeX

    @techreport{TWBR10,
    	author = "Touati, Sid-Ahmed-Ali and Worms, Julien and Briais, Sebastien",
    	title = "The {S}peedup {T}est",
    	institution = "PRiSM-CNRS : UMR8144 - Universit{\'e} de Versailles Saint-Quentin en Yvelines - ALCHEMY - INRIA Saclay - Ile de France - INRIA - CNRS: UMR8623 - Universit{\'e} Paris Sud - Paris XI - Laboratoire de Math{\'e}matiques de Versailles - LM-Versailles - CNRS : UMR8100 - Universit{\'e} de Versailles Saint-Quentin en Yvelines",
    	year = 2010,
    	note = "Software is included with the document; it implements the Speedup-Test protocol.",
    	abstract = "Numerous code optimisation methods are usually experimented by doing multiple observations of the initial and the optimised execution times in order to declare a speedup. Even with fixed input and execution environment, program execution times vary in general. Hence different kinds of speedups may be reported: the speedup of the average execution time, the speedup of the minimal execution time, the speedup of the median, etc. Many published speedups in the literature are observations of a set of experiments. In order to improve the reproducibility of the experimental results, this technical report presents a rigorous statistical methodology regarding program performance analysis. We rely on well known statistical tests (Shapiro-Wilk's test, Fisher's F-test, Student's t-test, Kolmogorov-Smirnov's test, Wilcoxon-Mann-Whitney's test) to study if the observed speedups are statistically significant or not. By fixing $0\frac{1}{2}$, the probability that an individual execution of the optimised code is faster than the individual execution of the initial code. Our methodology defines a consistent improvement compared to the usual performance analysis method in high performance computing as in \cite{Jain:1991:ACS,lilja:book}. We explain in each situation which hypotheses must be checked to declare a correct risk level for the statistics. The Speedup-Test protocol certifying the observed speedups with rigorous statistics is implemented and distributed as an open source tool based on R software.",
    	annote = "rapport",
    	file = "SpeedupTestDocument.pdf:http\://hal.inria.fr/inria-00443839/PDF/SpeedupTestDocument.pdf:PDF",
    	hal_id = "inria-00443839",
    	keywords = "Code optimisation, program performance evaluation and analysis, statistics",
    	language = "English",
    	owner = "MOIS",
    	timestamp = "2011.07.25",
    	url = "http://hal.inria.fr/inria-00443839/en/"
    }
    
  3. Abdelhafid Mazouz, Sid-Ahmed-Ali Touati and Denis Barthou. Measuring and Analysing the Variations of Program Execution Times on Multicore Platforms: Case Study. PRiSM-CNRS : UMR8144 - Université de Versailles Saint-Quentin en Yvelines - ALCHEMY - INRIA Saclay - Ile de France - INRIA - CNRS: UMR8623 - Université Paris Sud - Paris XI - Laboratoire Bordelais de Recherche en Informatique - LaBRI - CNRS : UMR5800 - Université Sciences et Technologies - Bordeaux I - Ecole Nationale Supérieure d'Electronique,Informatique et Radiocommunications de Bordeaux - Université Victor Segalen - Bordeaux II, 2010.
    Abstract The recent growth in the number of processing units in today's multicore processor architectures enables multiple threads to execute simultaneously, achieving better performance by exploiting thread level parallelism. With the architectural complexity of these new state of the art designs comes a need to better understand the interactions between the operating system layers, the applications and the underlying hardware platforms. The ability to characterise and to quantify those interactions can be useful in the processes of performance evaluation and analysis, compiler optimisations and operating system job scheduling, allowing better performance stability, reproducibility and predictability. We consider in our study performance instability as variations in program execution times. While these variations are statistically insignificant for large sequential applications, we observe that parallel native OpenMP programs have less performance stability. Understanding the performance instability in current multicore architectures is even more complicated by the variety of factors and sources influencing application performance. URL BibTeX

    @techreport{MTBA10a,
    	author = "Mazouz, Abdelhafid and Touati, Sid-Ahmed-Ali and Barthou, Denis",
    	title = "Measuring and {A}nalysing the {V}ariations of {P}rogram {E}xecution {T}imes on {M}ulticore {P}latforms: {C}ase {S}tudy",
    	institution = "PRiSM-CNRS : UMR8144 - Universit{\'e} de Versailles Saint-Quentin en Yvelines - ALCHEMY - INRIA Saclay - Ile de France - INRIA - CNRS: UMR8623 - Universit{\'e} Paris Sud - Paris XI - Laboratoire Bordelais de Recherche en Informatique - LaBRI - CNRS : UMR5800 - Universit{\'e} Sciences et Technologies - Bordeaux I - Ecole Nationale Sup{\'e}rieure d'Electronique,Informatique et Radiocommunications de Bordeaux - Universit{\'e} Victor Segalen - Bordeaux II",
    	year = 2010,
    	month = "Sep.",
    	abstract = "The recent growth in the number of processing units in today's multicore processor architectures enables multiple threads to execute simultaneously, achieving better performance by exploiting thread level parallelism. With the architectural complexity of these new state of the art designs comes a need to better understand the interactions between the operating system layers, the applications and the underlying hardware platforms. The ability to characterise and to quantify those interactions can be useful in the processes of performance evaluation and analysis, compiler optimisations and operating system job scheduling, allowing better performance stability, reproducibility and predictability. We consider in our study performance instability as variations in program execution times. While these variations are statistically insignificant for large sequential applications, we observe that parallel native OpenMP programs have less performance stability. Understanding the performance instability in current multicore architectures is even more complicated by the variety of factors and sources influencing application performance.",
    	annote = "rapport",
    	file = "VarExecTime.pdf:http\://hal.inria.fr/inria-00514548/PDF/VarExecTime.pdf:PDF",
    	hal_id = "inria-00514548",
    	keywords = "OpenMP, Multicore, Parallelism, Performance evaluation",
    	language = "English",
    	owner = "MOIS",
    	pages = 36,
    	timestamp = "2011.07.25",
    	url = "http://hal.inria.fr/inria-00514548/en/"
    }
    
  4. Sebastien Briais, Sid-Ahmed-Ali Touati and Karine Deschinkel. Ensuring Lexicographic-Positive Data Dependence Graphs in the SIRA Framework. PRiSM-CNRS : UMR8144 - Université de Versailles Saint-Quentin en Yvelines - ALCHEMY - INRIA Saclay - Ile de France - INRIA - CNRS : UMR8623 - Université Paris Sud - Paris XI - Laboratoire d'Informatique de Franche-Comté - LIFC - Université de Franche-Comté: EA4269, 2010.
    Abstract Usual cyclic scheduling problems, such as software pipelining, deal with precedence constraints having non-negative latencies. This seems a natural way for modelling scheduling problems, since instruction delays are generally non-negative quantities. However, in some cases, we need to consider edge latencies that do not only model instruction latencies, but model other precedence constraints. For instance in register optimisation problems, a generic machine model can allow considering access delays into/from registers (VLIW, EPIC, DSP). In this case, edge latencies may be non-positive, leading to a difficult scheduling problem in presence of resource constraints. This research report studies the problem of cyclic instruction scheduling with register requirement minimisation (without resource constraints). We show that pre-conditioning a data dependence graph (DDG) to satisfy register constraints before software pipelining under resource constraints may create cycles with non-positive distances, resulting from the acceptance of non-positive edge latencies. Such a DDG is called non lexicographic positive because it does not define a topological sort between the instruction instances: in other words, its full unrolling does not define an acyclic graph. As a compiler construction strategy, we cannot allow the creation of cycles with non-positive distances during the compilation flow, because a non lexicographic positive DDG does not guarantee the existence of a valid instruction schedule under resource constraints. This research report examines two strategies to avoid the creation of these problematic DDG cycles. A first strategy is reactive: it tolerates the creation of non-positive cycles in a first step, and if they are detected in a further check step, backtracks to eliminate them. A second strategy is proactive: it prevents the creation of non-positive cycles in the DDG during the register minimisation process. 
Our extensive experiments on FFMPEG, MEDIABENCH, SPEC2000 and SPEC2006 benchmarks show that the reactive strategy is faster and works well in practice, but may require more registers than the proactive strategy. Consequently, the reactive strategy is a suitable working solution for compilation if the number of available architectural registers is already fixed and register minimisation is not necessary (just consume fewer registers than the available capacity). However, the proactive strategy, while more time consuming, is a better alternative for register requirement minimisation: this may be the case when dealing with reconfigurable architectures, i.e. when the number of available architectural registers is defined posterior to the compilation of the application. URL BibTeX

    @techreport{BTDE10,
    	author = "Briais, Sebastien and Touati, Sid-Ahmed-Ali and Deschinkel, Karine",
    	title = "Ensuring Lexicographic-{P}ositive {D}ata {D}ependence {G}raphs in the {SIRA} {F}ramework",
    	institution = "PRiSM-CNRS : UMR8144 - Universit{\'e} de Versailles Saint-Quentin en Yvelines - ALCHEMY - INRIA Saclay - Ile de France - INRIA - CNRS : UMR8623 - Universit{\'e} Paris Sud - Paris XI - Laboratoire d'Informatique de Franche-Comt{\'e} - LIFC - Universit{\'e} de Franche-Comt{\'e}: EA4269",
    	year = 2010,
    	month = "Mar.",
    	abstract = "Usual cyclic scheduling problems, such as software pipelining, deal with precedence constraints having non-negative latencies. This seems a natural way for modelling scheduling problems, since instruction delays are generally non-negative quantities. However, in some cases, we need to consider edge latencies that do not only model instruction latencies, but model other precedence constraints. For instance in register optimisation problems, a generic machine model can allow considering access delays into/from registers (VLIW, EPIC, DSP). In this case, edge latencies may be non-positive, leading to a difficult scheduling problem in presence of resource constraints. This research report studies the problem of cyclic instruction scheduling with register requirement minimisation (without resource constraints). We show that pre-conditioning a data dependence graph (DDG) to satisfy register constraints before software pipelining under resource constraints may create cycles with non-positive distances, resulting from the acceptance of non-positive edge latencies. Such a DDG is called {\it non lexicographic positive} because it does not define a topological sort between the instruction instances: in other words, its full unrolling does not define an acyclic graph. As a compiler construction strategy, we cannot allow the creation of cycles with non-positive distances during the compilation flow, because a non lexicographic positive DDG does not guarantee the existence of a valid instruction schedule under resource constraints. This research report examines two strategies to avoid the creation of these problematic DDG cycles. A first strategy is reactive: it tolerates the creation of non-positive cycles in a first step, and if they are detected in a further check step, backtracks to eliminate them. A second strategy is proactive: it prevents the creation of non-positive cycles in the DDG during the register minimisation process. 
Our extensive experiments on FFMPEG, MEDIABENCH, SPEC2000 and SPEC2006 benchmarks show that the reactive strategy is faster and works well in practice, but may require more registers than the proactive strategy. Consequently, the reactive strategy is a suitable working solution for compilation if the number of available architectural registers is already fixed and register minimisation is not necessary (just consume fewer registers than the available capacity). However, the proactive strategy, while more time consuming, is a better alternative for register requirement minimisation: this may be the case when dealing with reconfigurable architectures, i.e. when the number of available architectural registers is defined posterior to the compilation of the application.",
    	annote = "rapport",
    	collaboration = "PRiSM-INRIA",
    	file = "_negcycle.pdf:http\://hal.inria.fr/inria-00452695/PDF/main\ \_report\\_negcycle.pdf:PDF",
    	hal_id = "inria-00452695",
    	keywords = "Compilation, Code optimisation, Register pressure, Cyclic instruction scheduling, Instruction level parallelism",
    	language = "English",
    	owner = "MOIS",
    	timestamp = "2011.07.25",
    	url = "http://hal.inria.fr/inria-00452695/en/"
    }
    
