<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>#NLP | datacraft</title>
	<atom:link href="https://datacraft.paris/tag/nlp/feed/" rel="self" type="application/rss+xml" />
	<link>https://datacraft.paris</link>
	<description>Club dedicated to data scientists and their companies</description>
	<lastBuildDate>Thu, 05 Dec 2024 13:28:15 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8.5</generator>

<image>
	<url>https://datacraft.paris/wp-content/uploads/2020/07/favicon.png</url>
	<title>#NLP | datacraft</title>
	<link>https://datacraft.paris</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>State of the art &#8211; Preserving Privacy in Deep Learning Models</title>
		<link>https://datacraft.paris/event/state-of-the-art-preserving-privacy-in-deep-learning-models/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=state-of-the-art-preserving-privacy-in-deep-learning-models</link>
		
		<dc:creator><![CDATA[datacraft]]></dc:creator>
		<pubDate>Wed, 19 Jun 2024 10:30:00 +0000</pubDate>
				<category><![CDATA[#AIDetection]]></category>
		<category><![CDATA[#Authentification]]></category>
		<category><![CDATA[#Copyright]]></category>
		<category><![CDATA[#LLM]]></category>
		<category><![CDATA[#NLP]]></category>
		<category><![CDATA[#Privacy]]></category>
		<category><![CDATA[#Watermarking]]></category>
		<guid isPermaLink="false">https://datacraft.paris/?post_type=tribe_events&#038;p=10947</guid>

					<description><![CDATA[Led by Tom Sander, META AI (FAIR)]]></description>
										<content:encoded><![CDATA[<div class="et_pb_section et_pb_section_0 et_section_regular" >
				
				
				
				
				
				
				<div class="et_pb_row et_pb_row_0">
				<div class="et_pb_column et_pb_column_4_4 et_pb_column_0  et_pb_css_mix_blend_mode_passthrough et-last-child">
				
				
				
				
				<div class="et_pb_module et_pb_code et_pb_code_0">
				
				
				
				
				<div class="et_pb_code_inner">            <div class="">

            </div>
            </div>
			</div><div class="et_pb_module et_pb_text et_pb_text_0  et_pb_text_align_left et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><div>This state of the art will be led by Tom Sander, META AI (FAIR)</div>
<div> </div>
<div>
<p>Deep Learning models are known for their capacity to memorize training data, a characteristic that can lead to significant privacy concerns.</p>
<p>In this talk, we will delve into the implications of this memorization and explore strategies to mitigate its effects. We will discuss practical methods to reduce unintended memorization and their results, providing insights into how these techniques can be deployed in real-world scenarios.</p>
<p>Tom’s work: <a href="https://scholar.google.com/citations?user=xrewx-sAAAAJ&amp;hl=en" target="_blank" rel="noopener">https://scholar.google.com/citations?user=xrewx-sAAAAJ&amp;hl=en</a></p>
<p><b>Audience</b>: technical</p>
</div>
<div> </div></div>
			</div>
			</div>
				
				
				
				
			</div>
				
				
			</div>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>State of the art &#8211; Practical insights for LLM fine-tuning and evaluation</title>
		<link>https://datacraft.paris/event/etat-de-lart-from-agi-promises-to-llm-realities-practical-insights-into-language-model-fine-tuning-and-evaluation/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=etat-de-lart-from-agi-promises-to-llm-realities-practical-insights-into-language-model-fine-tuning-and-evaluation</link>
		
		<dc:creator><![CDATA[datacraft]]></dc:creator>
		<pubDate>Wed, 28 Feb 2024 17:00:00 +0000</pubDate>
				<category><![CDATA[#AGI]]></category>
		<category><![CDATA[#Finetuning]]></category>
		<category><![CDATA[#GenAI]]></category>
		<category><![CDATA[#LLM]]></category>
		<category><![CDATA[#LoRA]]></category>
		<category><![CDATA[#NLP]]></category>
		<guid isPermaLink="false">https://datacraft.paris/?post_type=tribe_events&#038;p=9887</guid>

					<description><![CDATA[State of the art led by François Paupier, machine learning engineer, fpaupier engineering services]]></description>
										<content:encoded><![CDATA[<div class="et_pb_section et_pb_section_1 et_section_regular" >
				
				
				
				
				
				
				<div class="et_pb_row et_pb_row_1">
				<div class="et_pb_column et_pb_column_4_4 et_pb_column_1  et_pb_css_mix_blend_mode_passthrough et-last-child">
				
				
				
				
				<div class="et_pb_module et_pb_code et_pb_code_1 et_clickable">
				
				
				
				
				<div class="et_pb_code_inner">            <div class="">

            </div>
            </div>
			</div><div class="et_pb_module et_pb_text et_pb_text_1  et_pb_text_align_left et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p><strong>[The workshop will be delivered in English]</strong></p>
<p><strong>Machine Learning Level:</strong></p>
<p><strong>**</strong> Good knowledge of machine learning</p>
<p><strong>Python Level:</strong></p>
<p><strong>*</strong> Basic skills in Python</p>
<p><strong>Workshop description</strong></p>
<p>Vendors all but promise AGI-as-a-service through an API for enterprise clients, but the reality of applying a proprietary LLM service to a specific use case is often different. Irrelevant generations, out-of-context answers, misunderstood user queries and, more broadly, a lack of subject-matter expertise start to erode users&#8217; and shareholders&#8217; confidence in the transformative potential of deploying LLM-enhanced business workflows across your organisation. You&#8217;re not alone in this journey.</p>
<p>In this talk, we’ll explore the landscape of fine-tuning solutions for open-source LLMs, weighing their pros and cons. We&#8217;ll delve into the data required and how to design a robust evaluation framework to systematically assess your in-house model&#8217;s performance.</p>
<p>We’ll take a deep dive into the subtle differences between parameter-efficient fine-tuning (PEFT) methods and reinforcement learning approaches, and what to keep in mind when choosing between them.</p>
<p>This talk synthesises experience deploying LLM capabilities at various organisations, from startups to corporate environments. It&#8217;s a blend of insights from research papers and pragmatic experience. We won’t go into the details of the mathematical operations under the hood of each fine-tuning approach; instead, our goal is to share the intuition behind those concepts, equipping you to design an effective roadmap for fine-tuning an LLM for your specific business use case.</p>
<p>Slides in PDF format will be made available for free on the speaker&#8217;s Twitter account (@fpaupier) at the end of the talk.</p>
<p><strong>Speaker:</strong></p>
<p><a title="https://www.linkedin.com/in/fpaupier/" href="https://www.linkedin.com/in/fpaupier/" target="_blank" rel="nofollow noopener noreferrer">François Paupier</a>, machine learning engineer, fpaupier engineering services</p>
<p>&nbsp;</p></div>
			</div><div class="et_pb_module et_pb_video et_pb_video_0">
				
				
				
				
				<div class="et_pb_video_box"><iframe title="From AGI Promises to Realities: Practical insights into language model fine-tuning and evaluation" width="1080" height="608" src="https://www.youtube.com/embed/GrjWT-rCK0I?feature=oembed"  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></div>
				
			</div>
			</div>
				
				
				
				
			</div>
				
				
			</div>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>REX &#8211; Analysing clinical documents with natural language processing algorithms: some practical cases</title>
		<link>https://datacraft.paris/event/rex-analyser-les-documents-cliniques-a-laide-dalgorithmes-de-traitement-automatique-du-langage-quelques-cas-pratiques/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=rex-analyser-les-documents-cliniques-a-laide-dalgorithmes-de-traitement-automatique-du-langage-quelques-cas-pratiques</link>
		
		<dc:creator><![CDATA[datacraft]]></dc:creator>
		<pubDate>Tue, 21 Mar 2023 18:00:00 +0000</pubDate>
				<category><![CDATA[#Health]]></category>
		<category><![CDATA[#NLP]]></category>
		<guid isPermaLink="false">https://datacraft.paris/?post_type=tribe_events&#038;p=8139</guid>

					<description><![CDATA[This workshop will be presented by Romain Bey, Dr. Emmanuelle Kempf, Thomas Petit-Jean, Dr. Christel Gérardin and Perceval Wajsbürt, AP-HP]]></description>
										<content:encoded><![CDATA[<div class="et_pb_section et_pb_section_2 et_section_regular" >
				
				
				
				
				
				
				<div class="et_pb_row et_pb_row_2">
				<div class="et_pb_column et_pb_column_4_4 et_pb_column_2  et_pb_css_mix_blend_mode_passthrough et-last-child">
				
				
				
				
				<div class="et_pb_module et_pb_code et_pb_code_2">
				
				
				
				
				<div class="et_pb_code_inner">            <div class="">

            </div>
            </div>
			</div><div class="et_pb_module et_pb_text et_pb_text_2  et_pb_text_align_left et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p><b>Difficulty<br /></b><span style="font-weight: 400;">*: Basic knowledge of ML/data/AI</span></p>
<p><span style="font-weight: 400;"><br /></span><b>Technical prerequisites<br /></b>None</p>
<p><b>Speakers<br /></b><span style="font-weight: 400;">This workshop will be presented by: </span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;"><a href="https://www.linkedin.com/in/romain-bey-7b233889/">Romain Bey</a>, head of the Data Science team, Digital Services Department (Direction des Services Numériques), AP-HP </span></li>
<li style="font-weight: 400;" aria-level="1">Dr. Emmanuelle Kempf, oncologist at Henri Mondor hospital, lead for AI and cancer</li>
<li aria-level="1">Thomas Petit-Jean, data scientist, Digital Services Department (Direction des Services Numériques), AP-HP</li>
<li aria-level="1"><a href="https://www.linkedin.com/in/christel-g%C3%A9rardin-5a8206207/">Dr. Christel Gérardin</a>, internal medicine physician, currently pursuing an INSERM PhD</li>
<li aria-level="1"><a href="https://www.linkedin.com/in/percevalw/">Perceval Wajsbürt</a>, data scientist with a PhD in natural language processing on clinical data, Digital Services Department (Direction des Services Numériques), AP-HP</li>
</ul>
<p>&nbsp;</p>
<p><b>About the event<br /></b>AI makes it possible to automatically analyse millions of clinical documents held in health data warehouses (entrepôts de données de santé, EDS), to carry out research and to support the management of healthcare institutions. This presentation will cover the development and use of natural language processing (NLP) algorithms within the AP-HP EDS. It will be divided into 3 parts:</p>
<p>Development, validation and use of clinical NLP algorithms in the context of an EDS: what skills and technologies are needed? How can pre-existing algorithms be used? How can one contribute to improving them? (Speaker: Romain Bey)<br />Use case: algorithms that extract the Charlson score comorbidities mentioned in clinical reports (Speakers: Dr. Emmanuelle Kempf, Thomas Petit-Jean)<br />Use case: an algorithm that automatically detects the layout and section structure of clinical documents, in order to better contextualise the mentions thus detected (Speakers: Dr. Christel Gérardin, Perceval Wajsbürt)</p></div>
			</div>
			</div>
				
				
				
				
			</div>
				
				
			</div>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>[EVENT POSTPONED] REX &#8211; Analysing clinical documents with natural language processing algorithms: some examples from a hospital health data warehouse</title>
		<link>https://datacraft.paris/event/analyser-les-documents-cliniques-a-laide-dalgorithmes-de-tal/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=analyser-les-documents-cliniques-a-laide-dalgorithmes-de-tal</link>
		
		<dc:creator><![CDATA[datacraft]]></dc:creator>
		<pubDate>Thu, 16 Feb 2023 18:00:00 +0000</pubDate>
				<category><![CDATA[#Health]]></category>
		<category><![CDATA[#NLP]]></category>
		<guid isPermaLink="false">https://datacraft.paris/?post_type=tribe_events&#038;p=7880</guid>

					<description><![CDATA[This workshop will be led by a team from AP-HP]]></description>
										<content:encoded><![CDATA[<div class="et_pb_section et_pb_section_3 et_section_regular" >
				
				
				
				
				
				
				<div class="et_pb_row et_pb_row_3">
				<div class="et_pb_column et_pb_column_4_4 et_pb_column_3  et_pb_css_mix_blend_mode_passthrough et-last-child">
				
				
				
				
				<div class="et_pb_module et_pb_code et_pb_code_3">
				
				
				
				
				
			</div><div class="et_pb_module et_pb_text et_pb_text_3  et_pb_text_align_left et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><p>AI makes it possible to automatically analyse millions of clinical documents held in health data warehouses (entrepôts de données de santé, EDS), to carry out research and to support the management of healthcare institutions. This presentation will cover the development and use of natural language processing (NLP) algorithms within the AP-HP EDS. It will be divided into 3 parts:</p>
<p>Development, validation and use of clinical NLP algorithms in the context of an EDS: what skills and technologies are needed? How can pre-existing algorithms be used? How can one contribute to improving them? (Speaker: Romain Bey)<br />Use case: algorithms that extract the Charlson score comorbidities mentioned in clinical reports (Speakers: Dr. Emmanuelle Kempf, Thomas Petit-Jean)<br />Use case: an algorithm that automatically detects the layout and section structure of clinical documents, in order to better contextualise the mentions thus detected (Speakers: Dr. Christel Gérardin, Perceval Wajsbürt)</p></div>
			</div>
			</div>
				
				
				
				
			</div>
				
				
			</div>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
