Compare commits

..

47 Commits

Author SHA1 Message Date
ganshihao21 3c589f05c3 Fix mock object creation in the snowboydecoder.py tests
5 months ago
ganshihao21 3aee067a00 1
5 months ago
ganshihao21 b7dd02ba83 Fix incorrect image path retrieval
5 months ago
ganshihao21 5896205fc4 Update fetchToken.py to fix the audio_tts_post check error
5 months ago
ganshihao21 6d54f48d8d 1
5 months ago
qinxiaonan_branch 998de651f7 Final revision
5 months ago
qinxiaonan_branch 1f8df283bc Re-upload
5 months ago
Gehusileng fc70493fa6 1
5 months ago
Gehusileng 9be37a1750 1
5 months ago
qinxiaonan_branch a3b08d98c6 Update PPT
5 months ago
qinxiaonan_branch ce5652c8e2 Merge branch 'master' of https://bdgit.educoder.net/pcffgoxks/daomangzhang
5 months ago
qinxiaonan_branch 240a35ca2c Update PPT
5 months ago
liuyitao 578197dc3b Add project run code and update voice files
5 months ago
qinxiaonan_branch b705dba19c Overall revision
5 months ago
qinxiaonan_branch a364500164 Update window.py
5 months ago
ganshihao21 574b17ef95 Upload open-source code files
5 months ago
ganshihao21 3a599798e2 1
5 months ago
ganshihao21 7f55d88d55 Upload open-source code
5 months ago
qinxiaonan_branch ac19e2a08e Update personal self-assessment report
5 months ago
Gehusileng 810a26d0df Merge branch 'master' of https://bdgit.educoder.net/pcffgoxks/daomangzhang
5 months ago
Gehusileng 338080654f Full project code
5 months ago
qinxiaonan_branch 56e71b2fb0 Update train.py
5 months ago
qinxiaonan_branch 6316da5942 Revision 1
5 months ago
ganshihao21 49f13f4834 Add test cases for voice assistant wake-word detection
5 months ago
ganshihao21 5db191d534 1
5 months ago
ganshihao21 87c323da9c 1
5 months ago
qinxiaonan_branch 57b6dcbdc1 Update PPT
5 months ago
qinxiaonan_branch 55de79e11b Promotional flyer and presentation video
5 months ago
qinxiaonan_branch 153e2baa6c 0
5 months ago
qinxiaonan_branch bc5c74c47f Presentation PPT
5 months ago
qinxiaonan_branch 3f3185dc77 Personal document update 2
5 months ago
qinxiaonan_branch dee3e830fc Code update 1
5 months ago
qinxiaonan_branch 286ef3f0ba Personal document 1
5 months ago
qinxiaonan_branch 7e01a24520 1
5 months ago
qinxiaonan_branch d1dc958ef8 Merge branch 'master' of https://bdgit.educoder.net/pcffgoxks/daomangzhang
6 months ago
qinxiaonan_branch 20c9ef3788 Self-assessment report
6 months ago
ganshihao 4fa401aeca 111
6 months ago
ganshihao 30dbb4d7e0 Implement speech-to-text and speech synthesis
6 months ago
ganshihao 256f06f9e7 1
6 months ago
pcffgoxks 71da3a26a4 Merge pull request 'Complete speech synthesis and speech recognition' (#6) from ganshihao_branch into master
6 months ago
pcffgoxks 6f089a70b6 Merge pull request 'Offline speech synthesis' (#5) from liuyitao_branch into master
6 months ago
ganshihao 28400a065f Commit speech recognition and synthesis code
7 months ago
qinxiaonan_branch 4c674d5452 1
7 months ago
Gehusileng 50004adde5 Update README
7 months ago
Gehusileng a5c0210b0c Image recognition module source code
7 months ago
Gehusileng 1c065f3ba0 2
7 months ago
Gehusileng da470aca42 1
7 months ago

Binary file not shown.

After — new image, 591 KiB (preview not rendered)

Binary file not shown.

@@ -0,0 +1 @@
Subproject commit 64a5c7595d7404f78caf82348af9ae30b105f5b4

@@ -0,0 +1,674 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<http://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<http://www.gnu.org/philosophy/why-not-lgpl.html>.

@@ -0,0 +1,67 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Argoverse-HD dataset (ring-front-center camera) http://www.cs.cmu.edu/~mengtial/proj/streaming/
# Example usage: python train.py --data Argoverse.yaml
# parent
# ├── yolov5
# └── datasets
# └── Argoverse ← downloads here
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/Argoverse # dataset root dir
train: Argoverse-1.1/images/train/ # train images (relative to 'path') 39384 images
val: Argoverse-1.1/images/val/ # val images (relative to 'path') 15062 images
test: Argoverse-1.1/images/test/ # test images (optional) https://eval.ai/web/challenges/challenge-page/800/overview
# Classes
nc: 8 # number of classes
names: ['person', 'bicycle', 'car', 'motorcycle', 'bus', 'truck', 'traffic_light', 'stop_sign'] # class names
# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  import json

  from tqdm import tqdm
  from utils.general import download, Path


  def argoverse2yolo(set):
      labels = {}
      a = json.load(open(set, "rb"))
      for annot in tqdm(a['annotations'], desc=f"Converting {set} to YOLOv5 format..."):
          img_id = annot['image_id']
          img_name = a['images'][img_id]['name']
          img_label_name = img_name[:-3] + "txt"

          cls = annot['category_id']  # instance class id
          x_center, y_center, width, height = annot['bbox']
          x_center = (x_center + width / 2) / 1920.0  # offset and scale
          y_center = (y_center + height / 2) / 1200.0  # offset and scale
          width /= 1920.0  # scale
          height /= 1200.0  # scale

          img_dir = set.parents[2] / 'Argoverse-1.1' / 'labels' / a['seq_dirs'][a['images'][annot['image_id']]['sid']]
          if not img_dir.exists():
              img_dir.mkdir(parents=True, exist_ok=True)

          k = str(img_dir / img_label_name)
          if k not in labels:
              labels[k] = []
          labels[k].append(f"{cls} {x_center} {y_center} {width} {height}\n")

      for k in labels:
          with open(k, "w") as f:
              f.writelines(labels[k])


  # Download
  dir = Path('../datasets/Argoverse')  # dataset root dir
  urls = ['https://argoverse-hd.s3.us-east-2.amazonaws.com/Argoverse-HD-Full.zip']
  download(urls, dir=dir, delete=False)

  # Convert
  annotations_dir = 'Argoverse-HD/annotations/'
  (dir / 'Argoverse-1.1' / 'tracking').rename(dir / 'Argoverse-1.1' / 'images')  # rename 'tracking' to 'images'
  for d in "train.json", "val.json":
      argoverse2yolo(dir / annotations_dir / d)  # convert Argoverse annotations to YOLO labels
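
Every dataset YAML in this diff follows the same schema: path/train/val/test locations, nc, names, and an optional download step. As a quick sanity check, a few lines of Python can load one of these files and confirm the class list is consistent — a sketch only, assuming PyYAML is installed and a hypothetical file location:

import yaml
from pathlib import Path

with open('data/Argoverse.yaml') as f:  # hypothetical path to the file above
    data = yaml.safe_load(f)

assert data['nc'] == len(data['names']), 'nc must equal the number of class names'
root = Path(data['path'])  # dataset root dir, resolved relative to the yolov5/ working dir
print(f"root={root}, {data['nc']} classes, train={data['train']}, val={data['val']}")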

@@ -0,0 +1,53 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Global Wheat 2020 dataset http://www.global-wheat.com/
# Example usage: python train.py --data GlobalWheat2020.yaml
# parent
# ├── yolov5
# └── datasets
# └── GlobalWheat2020 ← downloads here
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/GlobalWheat2020 # dataset root dir
train: # train images (relative to 'path') 3422 images
- images/arvalis_1
- images/arvalis_2
- images/arvalis_3
- images/ethz_1
- images/rres_1
- images/inrae_1
- images/usask_1
val: # val images (relative to 'path') 748 images (WARNING: train set contains ethz_1)
- images/ethz_1
test: # test images (optional) 1276 images
- images/utokyo_1
- images/utokyo_2
- images/nau_1
- images/uq_1
# Classes
nc: 1 # number of classes
names: ['wheat_head'] # class names
# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  from utils.general import download, Path

  # Download
  dir = Path(yaml['path'])  # dataset root dir
  urls = ['https://zenodo.org/record/4298502/files/global-wheat-codalab-official.zip',
          'https://github.com/ultralytics/yolov5/releases/download/v1.0/GlobalWheat2020_labels.zip']
  download(urls, dir=dir)

  # Make Directories
  for p in 'annotations', 'images', 'labels':
      (dir / p).mkdir(parents=True, exist_ok=True)

  # Move
  for p in 'arvalis_1', 'arvalis_2', 'arvalis_3', 'ethz_1', 'rres_1', 'inrae_1', 'usask_1', \
           'utokyo_1', 'utokyo_2', 'nau_1', 'uq_1':
      (dir / p).rename(dir / 'images' / p)  # move to /images
      f = (dir / p).with_suffix('.json')  # json file
      if f.exists():
          f.rename((dir / 'annotations' / p).with_suffix('.json'))  # move to /annotations
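
The download() helper these scripts import from utils.general belongs to the YOLOv5 codebase and is not part of this diff. A minimal stand-in with the same call shape (list of URLs, target dir, optional delete) might look like the following sketch; the real helper additionally supports curl, threads, and .tar.gz archives:

import urllib.request
import zipfile
from pathlib import Path


def download(urls, dir='.', delete=True):
    # Minimal stand-in: fetch each URL into dir, unzip .zip archives in place,
    # and optionally remove the archive afterwards.
    dir = Path(dir)
    dir.mkdir(parents=True, exist_ok=True)
    for url in urls:
        f = dir / url.split('/')[-1]
        if not f.exists():
            urllib.request.urlretrieve(url, f)
        if f.suffix == '.zip':
            zipfile.ZipFile(f).extractall(path=dir)
            if delete:
                f.unlink()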

@@ -0,0 +1,112 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Objects365 dataset https://www.objects365.org/
# Example usage: python train.py --data Objects365.yaml
# parent
# ├── yolov5
# └── datasets
# └── Objects365 ← downloads here
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/Objects365 # dataset root dir
train: images/train # train images (relative to 'path') 1742289 images
val: images/val # val images (relative to 'path') 80000 images
test: # test images (optional)
# Classes
nc: 365 # number of classes
names: ['Person', 'Sneakers', 'Chair', 'Other Shoes', 'Hat', 'Car', 'Lamp', 'Glasses', 'Bottle', 'Desk', 'Cup',
        'Street Lights', 'Cabinet/shelf', 'Handbag/Satchel', 'Bracelet', 'Plate', 'Picture/Frame', 'Helmet', 'Book',
        'Gloves', 'Storage box', 'Boat', 'Leather Shoes', 'Flower', 'Bench', 'Potted Plant', 'Bowl/Basin', 'Flag',
        'Pillow', 'Boots', 'Vase', 'Microphone', 'Necklace', 'Ring', 'SUV', 'Wine Glass', 'Belt', 'Monitor/TV',
        'Backpack', 'Umbrella', 'Traffic Light', 'Speaker', 'Watch', 'Tie', 'Trash bin Can', 'Slippers', 'Bicycle',
        'Stool', 'Barrel/bucket', 'Van', 'Couch', 'Sandals', 'Basket', 'Drum', 'Pen/Pencil', 'Bus', 'Wild Bird',
        'High Heels', 'Motorcycle', 'Guitar', 'Carpet', 'Cell Phone', 'Bread', 'Camera', 'Canned', 'Truck',
        'Traffic cone', 'Cymbal', 'Lifesaver', 'Towel', 'Stuffed Toy', 'Candle', 'Sailboat', 'Laptop', 'Awning',
        'Bed', 'Faucet', 'Tent', 'Horse', 'Mirror', 'Power outlet', 'Sink', 'Apple', 'Air Conditioner', 'Knife',
        'Hockey Stick', 'Paddle', 'Pickup Truck', 'Fork', 'Traffic Sign', 'Balloon', 'Tripod', 'Dog', 'Spoon', 'Clock',
        'Pot', 'Cow', 'Cake', 'Dinning Table', 'Sheep', 'Hanger', 'Blackboard/Whiteboard', 'Napkin', 'Other Fish',
        'Orange/Tangerine', 'Toiletry', 'Keyboard', 'Tomato', 'Lantern', 'Machinery Vehicle', 'Fan',
        'Green Vegetables', 'Banana', 'Baseball Glove', 'Airplane', 'Mouse', 'Train', 'Pumpkin', 'Soccer', 'Skiboard',
        'Luggage', 'Nightstand', 'Tea pot', 'Telephone', 'Trolley', 'Head Phone', 'Sports Car', 'Stop Sign',
        'Dessert', 'Scooter', 'Stroller', 'Crane', 'Remote', 'Refrigerator', 'Oven', 'Lemon', 'Duck', 'Baseball Bat',
        'Surveillance Camera', 'Cat', 'Jug', 'Broccoli', 'Piano', 'Pizza', 'Elephant', 'Skateboard', 'Surfboard',
        'Gun', 'Skating and Skiing shoes', 'Gas stove', 'Donut', 'Bow Tie', 'Carrot', 'Toilet', 'Kite', 'Strawberry',
        'Other Balls', 'Shovel', 'Pepper', 'Computer Box', 'Toilet Paper', 'Cleaning Products', 'Chopsticks',
        'Microwave', 'Pigeon', 'Baseball', 'Cutting/chopping Board', 'Coffee Table', 'Side Table', 'Scissors',
        'Marker', 'Pie', 'Ladder', 'Snowboard', 'Cookies', 'Radiator', 'Fire Hydrant', 'Basketball', 'Zebra', 'Grape',
        'Giraffe', 'Potato', 'Sausage', 'Tricycle', 'Violin', 'Egg', 'Fire Extinguisher', 'Candy', 'Fire Truck',
        'Billiards', 'Converter', 'Bathtub', 'Wheelchair', 'Golf Club', 'Briefcase', 'Cucumber', 'Cigar/Cigarette',
        'Paint Brush', 'Pear', 'Heavy Truck', 'Hamburger', 'Extractor', 'Extension Cord', 'Tong', 'Tennis Racket',
        'Folder', 'American Football', 'earphone', 'Mask', 'Kettle', 'Tennis', 'Ship', 'Swing', 'Coffee Machine',
        'Slide', 'Carriage', 'Onion', 'Green beans', 'Projector', 'Frisbee', 'Washing Machine/Drying Machine',
        'Chicken', 'Printer', 'Watermelon', 'Saxophone', 'Tissue', 'Toothbrush', 'Ice cream', 'Hot-air balloon',
        'Cello', 'French Fries', 'Scale', 'Trophy', 'Cabbage', 'Hot dog', 'Blender', 'Peach', 'Rice', 'Wallet/Purse',
        'Volleyball', 'Deer', 'Goose', 'Tape', 'Tablet', 'Cosmetics', 'Trumpet', 'Pineapple', 'Golf Ball',
        'Ambulance', 'Parking meter', 'Mango', 'Key', 'Hurdle', 'Fishing Rod', 'Medal', 'Flute', 'Brush', 'Penguin',
        'Megaphone', 'Corn', 'Lettuce', 'Garlic', 'Swan', 'Helicopter', 'Green Onion', 'Sandwich', 'Nuts',
        'Speed Limit Sign', 'Induction Cooker', 'Broom', 'Trombone', 'Plum', 'Rickshaw', 'Goldfish', 'Kiwi fruit',
        'Router/modem', 'Poker Card', 'Toaster', 'Shrimp', 'Sushi', 'Cheese', 'Notepaper', 'Cherry', 'Pliers', 'CD',
        'Pasta', 'Hammer', 'Cue', 'Avocado', 'Hamimelon', 'Flask', 'Mushroom', 'Screwdriver', 'Soap', 'Recorder',
        'Bear', 'Eggplant', 'Board Eraser', 'Coconut', 'Tape Measure/Ruler', 'Pig', 'Showerhead', 'Globe', 'Chips',
        'Steak', 'Crosswalk Sign', 'Stapler', 'Camel', 'Formula 1', 'Pomegranate', 'Dishwasher', 'Crab',
        'Hoverboard', 'Meat ball', 'Rice Cooker', 'Tuba', 'Calculator', 'Papaya', 'Antelope', 'Parrot', 'Seal',
        'Butterfly', 'Dumbbell', 'Donkey', 'Lion', 'Urinal', 'Dolphin', 'Electric Drill', 'Hair Dryer', 'Egg tart',
        'Jellyfish', 'Treadmill', 'Lighter', 'Grapefruit', 'Game board', 'Mop', 'Radish', 'Baozi', 'Target', 'French',
        'Spring Rolls', 'Monkey', 'Rabbit', 'Pencil Case', 'Yak', 'Red Cabbage', 'Binoculars', 'Asparagus', 'Barbell',
        'Scallop', 'Noddles', 'Comb', 'Dumpling', 'Oyster', 'Table Tennis paddle', 'Cosmetics Brush/Eyeliner Pencil',
        'Chainsaw', 'Eraser', 'Lobster', 'Durian', 'Okra', 'Lipstick', 'Cosmetics Mirror', 'Curling', 'Table Tennis']
# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  from pycocotools.coco import COCO
  from tqdm import tqdm
  from utils.general import Path, download, np, xyxy2xywhn

  # Make Directories
  dir = Path(yaml['path'])  # dataset root dir
  for p in 'images', 'labels':
      (dir / p).mkdir(parents=True, exist_ok=True)
      for q in 'train', 'val':
          (dir / p / q).mkdir(parents=True, exist_ok=True)

  # Train, Val Splits
  for split, patches in [('train', 50 + 1), ('val', 43 + 1)]:
      print(f"Processing {split} in {patches} patches ...")
      images, labels = dir / 'images' / split, dir / 'labels' / split

      # Download
      url = f"https://dorc.ks3-cn-beijing.ksyun.com/data-set/2020Objects365%E6%95%B0%E6%8D%AE%E9%9B%86/{split}/"
      if split == 'train':
          download([f'{url}zhiyuan_objv2_{split}.tar.gz'], dir=dir, delete=False)  # annotations json
          download([f'{url}patch{i}.tar.gz' for i in range(patches)], dir=images, curl=True, delete=False, threads=8)
      elif split == 'val':
          download([f'{url}zhiyuan_objv2_{split}.json'], dir=dir, delete=False)  # annotations json
          download([f'{url}images/v1/patch{i}.tar.gz' for i in range(15 + 1)], dir=images, curl=True, delete=False, threads=8)
          download([f'{url}images/v2/patch{i}.tar.gz' for i in range(16, patches)], dir=images, curl=True, delete=False, threads=8)

      # Move
      for f in tqdm(images.rglob('*.jpg'), desc=f'Moving {split} images'):
          f.rename(images / f.name)  # move to /images/{split}

      # Labels
      coco = COCO(dir / f'zhiyuan_objv2_{split}.json')
      names = [x["name"] for x in coco.loadCats(coco.getCatIds())]
      for cid, cat in enumerate(names):
          catIds = coco.getCatIds(catNms=[cat])
          imgIds = coco.getImgIds(catIds=catIds)
          for im in tqdm(coco.loadImgs(imgIds), desc=f'Class {cid + 1}/{len(names)} {cat}'):
              width, height = im["width"], im["height"]
              path = Path(im["file_name"])  # image filename
              try:
                  with open(labels / path.with_suffix('.txt').name, 'a') as file:
                      annIds = coco.getAnnIds(imgIds=im["id"], catIds=catIds, iscrowd=None)
                      for a in coco.loadAnns(annIds):
                          x, y, w, h = a['bbox']  # bounding box in xywh (xy top-left corner)
                          xyxy = np.array([x, y, x + w, y + h])[None]  # pixels(1,4)
                          x, y, w, h = xyxy2xywhn(xyxy, w=width, h=height, clip=True)[0]  # normalized and clipped
                          file.write(f"{cid} {x:.5f} {y:.5f} {w:.5f} {h:.5f}\n")
              except Exception as e:
                  print(e)
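
The xyxy2xywhn call above is what turns pixel-space corner boxes into the normalized center format that every label file in these scripts uses. The arithmetic, spelled out with hand-picked numbers (illustrative values, not taken from the dataset):

# COCO-style bbox: top-left (x, y) plus width/height, in pixels
x, y, w, h = 96.0, 48.0, 128.0, 64.0
W, H = 640, 480  # image size

xc = (x + w / 2) / W  # 0.25000  normalized box-center x
yc = (y + h / 2) / H  # 0.16667  normalized box-center y
wn = w / W            # 0.20000  normalized width
hn = h / H            # 0.13333  normalized height
print(f"0 {xc:.5f} {yc:.5f} {wn:.5f} {hn:.5f}")  # one YOLO label line, class 0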

@@ -0,0 +1,52 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# SKU-110K retail items dataset https://github.com/eg4000/SKU110K_CVPR19
# Example usage: python train.py --data SKU-110K.yaml
# parent
# ├── yolov5
# └── datasets
# └── SKU-110K ← downloads here
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/SKU-110K # dataset root dir
train: train.txt # train images (relative to 'path') 8219 images
val: val.txt # val images (relative to 'path') 588 images
test: test.txt # test images (optional) 2936 images
# Classes
nc: 1 # number of classes
names: ['object'] # class names
# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  import shutil
  from tqdm import tqdm
  from utils.general import np, pd, Path, download, xyxy2xywh

  # Download
  dir = Path(yaml['path'])  # dataset root dir
  parent = Path(dir.parent)  # download dir
  urls = ['http://trax-geometry.s3.amazonaws.com/cvpr_challenge/SKU110K_fixed.tar.gz']
  download(urls, dir=parent, delete=False)

  # Rename directories
  if dir.exists():
      shutil.rmtree(dir)
  (parent / 'SKU110K_fixed').rename(dir)  # rename dir
  (dir / 'labels').mkdir(parents=True, exist_ok=True)  # create labels dir

  # Convert labels
  names = 'image', 'x1', 'y1', 'x2', 'y2', 'class', 'image_width', 'image_height'  # column names
  for d in 'annotations_train.csv', 'annotations_val.csv', 'annotations_test.csv':
      x = pd.read_csv(dir / 'annotations' / d, names=names).values  # annotations
      images, unique_images = x[:, 0], np.unique(x[:, 0])
      with open((dir / d).with_suffix('.txt').__str__().replace('annotations_', ''), 'w') as f:
          f.writelines(f'./images/{s}\n' for s in unique_images)
      for im in tqdm(unique_images, desc=f'Converting {dir / d}'):
          cls = 0  # single-class dataset
          with open((dir / 'labels' / im).with_suffix('.txt'), 'a') as f:
              for r in x[images == im]:
                  w, h = r[6], r[7]  # image width, height
                  xywh = xyxy2xywh(np.array([[r[1] / w, r[2] / h, r[3] / w, r[4] / h]]))[0]  # instance
                  f.write(f"{cls} {xywh[0]:.5f} {xywh[1]:.5f} {xywh[2]:.5f} {xywh[3]:.5f}\n")  # write label

@@ -0,0 +1,80 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC
# Example usage: python train.py --data VOC.yaml
# parent
# ├── yolov5
# └── datasets
# └── VOC ← downloads here
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/VOC
train: # train images (relative to 'path') 16551 images
- images/train2012
- images/train2007
- images/val2012
- images/val2007
val: # val images (relative to 'path') 4952 images
- images/test2007
test: # test images (optional)
- images/test2007
# Classes
nc: 20 # number of classes
names: ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog',
        'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']  # class names
# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  import xml.etree.ElementTree as ET
  from tqdm import tqdm
  from utils.general import download, Path


  def convert_label(path, lb_path, year, image_id):
      def convert_box(size, box):
          dw, dh = 1. / size[0], 1. / size[1]
          x, y, w, h = (box[0] + box[1]) / 2.0 - 1, (box[2] + box[3]) / 2.0 - 1, box[1] - box[0], box[3] - box[2]
          return x * dw, y * dh, w * dw, h * dh

      in_file = open(path / f'VOC{year}/Annotations/{image_id}.xml')
      out_file = open(lb_path, 'w')
      tree = ET.parse(in_file)
      root = tree.getroot()
      size = root.find('size')
      w = int(size.find('width').text)
      h = int(size.find('height').text)

      for obj in root.iter('object'):
          cls = obj.find('name').text
          if cls in yaml['names'] and not int(obj.find('difficult').text) == 1:
              xmlbox = obj.find('bndbox')
              bb = convert_box((w, h), [float(xmlbox.find(x).text) for x in ('xmin', 'xmax', 'ymin', 'ymax')])
              cls_id = yaml['names'].index(cls)  # class id
              out_file.write(" ".join([str(a) for a in (cls_id, *bb)]) + '\n')


  # Download
  dir = Path(yaml['path'])  # dataset root dir
  url = 'https://github.com/ultralytics/yolov5/releases/download/v1.0/'
  urls = [url + 'VOCtrainval_06-Nov-2007.zip',  # 446MB, 5012 images
          url + 'VOCtest_06-Nov-2007.zip',  # 438MB, 4953 images
          url + 'VOCtrainval_11-May-2012.zip']  # 1.95GB, 17126 images
  download(urls, dir=dir / 'images', delete=False)

  # Convert
  path = dir / 'images/VOCdevkit'
  for year, image_set in ('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test'):
      imgs_path = dir / 'images' / f'{image_set}{year}'
      lbs_path = dir / 'labels' / f'{image_set}{year}'
      imgs_path.mkdir(exist_ok=True, parents=True)
      lbs_path.mkdir(exist_ok=True, parents=True)

      image_ids = open(path / f'VOC{year}/ImageSets/Main/{image_set}.txt').read().strip().split()
      for id in tqdm(image_ids, desc=f'{image_set}{year}'):
          f = path / f'VOC{year}/JPEGImages/{id}.jpg'  # old img path
          lb_path = (lbs_path / f.name).with_suffix('.txt')  # new label path
          f.rename(imgs_path / f.name)  # move image
          convert_label(path, lb_path, year, id)  # convert labels to YOLO format

@@ -0,0 +1,61 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# VisDrone2019-DET dataset https://github.com/VisDrone/VisDrone-Dataset
# Example usage: python train.py --data VisDrone.yaml
# parent
# ├── yolov5
# └── datasets
# └── VisDrone ← downloads here
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/VisDrone # dataset root dir
train: VisDrone2019-DET-train/images # train images (relative to 'path') 6471 images
val: VisDrone2019-DET-val/images # val images (relative to 'path') 548 images
test: VisDrone2019-DET-test-dev/images # test images (optional) 1610 images
# Classes
nc: 10 # number of classes
names: ['pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor']
# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  from utils.general import download, os, Path


  def visdrone2yolo(dir):
      from PIL import Image
      from tqdm import tqdm

      def convert_box(size, box):
          # Convert VisDrone box to YOLO xywh box
          dw = 1. / size[0]
          dh = 1. / size[1]
          return (box[0] + box[2] / 2) * dw, (box[1] + box[3] / 2) * dh, box[2] * dw, box[3] * dh

      (dir / 'labels').mkdir(parents=True, exist_ok=True)  # make labels directory
      pbar = tqdm((dir / 'annotations').glob('*.txt'), desc=f'Converting {dir}')
      for f in pbar:
          img_size = Image.open((dir / 'images' / f.name).with_suffix('.jpg')).size
          lines = []
          with open(f, 'r') as file:  # read annotation.txt
              for row in [x.split(',') for x in file.read().strip().splitlines()]:
                  if row[4] == '0':  # VisDrone 'ignored regions' class 0
                      continue
                  cls = int(row[5]) - 1
                  box = convert_box(img_size, tuple(map(int, row[:4])))
                  lines.append(f"{cls} {' '.join(f'{x:.6f}' for x in box)}\n")
          with open(str(f).replace(os.sep + 'annotations' + os.sep, os.sep + 'labels' + os.sep), 'w') as fl:
              fl.writelines(lines)  # write label.txt


  # Download
  dir = Path(yaml['path'])  # dataset root dir
  urls = ['https://github.com/ultralytics/yolov5/releases/download/v1.0/VisDrone2019-DET-train.zip',
          'https://github.com/ultralytics/yolov5/releases/download/v1.0/VisDrone2019-DET-val.zip',
          'https://github.com/ultralytics/yolov5/releases/download/v1.0/VisDrone2019-DET-test-dev.zip',
          'https://github.com/ultralytics/yolov5/releases/download/v1.0/VisDrone2019-DET-test-challenge.zip']
  download(urls, dir=dir)

  # Convert
  for d in 'VisDrone2019-DET-train', 'VisDrone2019-DET-val', 'VisDrone2019-DET-test-dev':
      visdrone2yolo(dir / d)  # convert VisDrone annotations to YOLO labels

@@ -0,0 +1,44 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# COCO 2017 dataset http://cocodataset.org
# Example usage: python train.py --data coco.yaml
# parent
# ├── yolov5
# └── datasets
# └── coco ← downloads here
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco # dataset root dir
train: train2017.txt # train images (relative to 'path') 118287 images
val: val2017.txt # val images (relative to 'path') 5000 images
test: test-dev2017.txt # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794
# Classes
nc: 80 # number of classes
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush'] # class names
# Download script/URL (optional)
download: |
from utils.general import download, Path
# Download labels
segments = False # segment or box labels
dir = Path(yaml['path']) # dataset root dir
url = 'https://github.com/ultralytics/yolov5/releases/download/v1.0/'
urls = [url + ('coco2017labels-segments.zip' if segments else 'coco2017labels.zip')] # labels
download(urls, dir=dir.parent)
# Download data
urls = ['http://images.cocodataset.org/zips/train2017.zip', # 19G, 118k images
'http://images.cocodataset.org/zips/val2017.zip', # 1G, 5k images
'http://images.cocodataset.org/zips/test2017.zip'] # 7G, 41k images (optional)
download(urls, dir=dir / 'images', threads=3)
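The `download:` blocks above reference a `yaml` variable they never define, so they are evidently executed by the training pipeline with the parsed dataset dict in scope. A minimal sketch of that mechanism (an assumption about the pipeline, not code from this diff):

```python
import yaml as pyyaml  # PyYAML

with open('data/coco.yaml', errors='ignore') as f:
    data = pyyaml.safe_load(f)

s = data.get('download', '')
if s and not s.startswith('http'):  # a script block rather than a plain URL
    exec(s, {'yaml': data})         # the script can now use yaml['path'] etc.
```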

@ -0,0 +1,30 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# COCO128 dataset https://www.kaggle.com/ultralytics/coco128 (first 128 images from COCO train2017)
# Example usage: python train.py --data coco128.yaml
# parent
# ├── yolov5
# └── datasets
# └── coco128 ← downloads here
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128 # dataset root dir
train: images/train2017 # train images (relative to 'path') 128 images
val: images/train2017 # val images (relative to 'path') 128 images
test: # test images (optional)
# Classes
nc: 80 # number of classes
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush'] # class names
# Download script/URL (optional)
download: https://ultralytics.com/assets/coco128.zip

@ -0,0 +1,13 @@
# Custom data for phone/person detection
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /home/data/yolo_format/images/train
val: /home/data/yolo_format/images/val
# number of classes
nc: 2
# class names
names: ['phone', 'person']
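With only two classes, a quick sanity check that every generated label file agrees with `nc` can save a failed training run. A small sketch, assuming the usual YOLO convention that the labels directory mirrors the images directory:

```python
from pathlib import Path

nc = 2  # from the YAML above
labels_dir = Path('/home/data/yolo_format/labels/train')  # images/ -> labels/ mirror (assumed)
for txt in labels_dir.glob('*.txt'):
    for line in txt.read_text().splitlines():
        cls, *coords = line.split()
        assert int(cls) < nc, f'class id out of range in {txt}: {line}'
        assert all(0.0 <= float(v) <= 1.0 for v in coords), f'coords not normalized in {txt}: {line}'
```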

@ -0,0 +1,39 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Hyperparameters for VOC finetuning
# python train.py --batch 64 --weights yolov5m.pt --data VOC.yaml --img 512 --epochs 50
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
# Hyperparameter Evolution Results
# Generations: 306
# P R mAP.5 mAP.5:.95 box obj cls
# Metrics: 0.6 0.936 0.896 0.684 0.0115 0.00805 0.00146
lr0: 0.0032
lrf: 0.12
momentum: 0.843
weight_decay: 0.00036
warmup_epochs: 2.0
warmup_momentum: 0.5
warmup_bias_lr: 0.05
box: 0.0296
cls: 0.243
cls_pw: 0.631
obj: 0.301
obj_pw: 0.911
iou_t: 0.2
anchor_t: 2.91
# anchors: 3.63
fl_gamma: 0.0
hsv_h: 0.0138
hsv_s: 0.664
hsv_v: 0.464
degrees: 0.373
translate: 0.245
scale: 0.898
shear: 0.602
perspective: 0.0
flipud: 0.00856
fliplr: 0.5
mosaic: 1.0
mixup: 0.243
copy_paste: 0.0

@ -0,0 +1,31 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
lr0: 0.00258
lrf: 0.17
momentum: 0.779
weight_decay: 0.00058
warmup_epochs: 1.33
warmup_momentum: 0.86
warmup_bias_lr: 0.0711
box: 0.0539
cls: 0.299
cls_pw: 0.825
obj: 0.632
obj_pw: 1.0
iou_t: 0.2
anchor_t: 3.44
anchors: 3.2
fl_gamma: 0.0
hsv_h: 0.0188
hsv_s: 0.704
hsv_v: 0.36
degrees: 0.0
translate: 0.0902
scale: 0.491
shear: 0.0
perspective: 0.0
flipud: 0.0
fliplr: 0.5
mosaic: 1.0
mixup: 0.0
copy_paste: 0.0

@ -0,0 +1,34 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Hyperparameters for high-augmentation COCO training from scratch
# python train.py --batch 32 --cfg yolov5m6.yaml --weights '' --data coco.yaml --img 1280 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.2 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.3 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 0.7 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.9 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.1 # image mixup (probability)
copy_paste: 0.1 # segment copy-paste (probability)
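`lr0` and `lrf` define the endpoints of the schedule: training starts at `lr0` and decays to `lr0 * lrf`. A small sketch of that decay, modeled on the cosine `one_cycle` helper YOLOv5 uses (assumed here; the helper itself is not shown in this diff):

```python
import math

def one_cycle(y1=1.0, y2=0.2, steps=300):
    # cosine ramp from y1 down to y2 over `steps` epochs
    return lambda x: ((1 - math.cos(x * math.pi / steps)) / 2) * (y2 - y1) + y1

lr0, lrf, epochs = 0.01, 0.2, 300  # values from the hyperparameters above
lf = one_cycle(1, lrf, epochs)
print(lr0 * lf(0), lr0 * lf(epochs))  # 0.01 at epoch 0 -> 0.002 (= lr0 * lrf) at the end
```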

@ -0,0 +1,34 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Hyperparameters for low-augmentation COCO training from scratch
# python train.py --batch 64 --cfg yolov5n6.yaml --weights '' --data coco.yaml --img 640 --epochs 300 --linear
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.01 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.5 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability)
copy_paste: 0.0 # segment copy-paste (probability)

@ -0,0 +1,34 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Hyperparameters for medium-augmentation COCO training from scratch
# python train.py --batch 32 --cfg yolov5m6.yaml --weights '' --data coco.yaml --img 1280 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.3 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 0.7 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.9 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.1 # image mixup (probability)
copy_paste: 0.0 # segment copy-paste (probability)

@ -0,0 +1,34 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Hyperparameters for COCO training from scratch
# python train.py --batch 40 --cfg yolov5m.yaml --weights '' --data coco.yaml --img 640 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.5 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability)
copy_paste: 0.0 # segment copy-paste (probability)

[11 binary image files added, contents not shown (476, 74, 392, 223, 181, 197, 646, 344, 315, 444 and 165 KiB)]

@ -0,0 +1,12 @@
# Custom data for mask detection
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: F:/up/1212/YOLO_Mask/score/images/train
val: F:/up/1212/YOLO_Mask/score/images/val
# number of classes
nc: 2
# class names
names: ['mask', 'face']

@ -0,0 +1,20 @@
#!/bin/bash
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Download latest models from https://github.com/ultralytics/yolov5/releases
# Example usage: bash path/to/download_weights.sh
# parent
# └── yolov5
# ├── yolov5s.pt ← downloads here
# ├── yolov5m.pt
# └── ...
python - <<EOF
from utils.downloads import attempt_download
models = ['n', 's', 'm', 'l', 'x']
models.extend([x + '6' for x in models]) # add P6 models
for x in models:
attempt_download(f'yolov5{x}.pt')
EOF

@ -0,0 +1,27 @@
#!/bin/bash
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Download COCO 2017 dataset http://cocodataset.org
# Example usage: bash data/scripts/get_coco.sh
# parent
# ├── yolov5
# └── datasets
# └── coco ← downloads here
# Download/unzip labels
d='../datasets' # unzip directory
url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
f='coco2017labels.zip' # or 'coco2017labels-segments.zip', 68 MB
echo 'Downloading' $url$f ' ...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f &
# Download/unzip images
d='../datasets/coco/images' # unzip directory
url=http://images.cocodataset.org/zips/
f1='train2017.zip' # 19G, 118k images
f2='val2017.zip' # 1G, 5k images
f3='test2017.zip' # 7G, 41k images (optional)
for f in $f1 $f2; do
echo 'Downloading' $url$f '...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f &
done
wait # finish background tasks
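The same parallel download-and-unzip can be done from Python with the repo's own `download` helper, as the dataset YAMLs above already do (assuming its default unzip-and-delete behavior):

```python
from utils.general import download  # repo helper used by the dataset YAMLs

urls = ['http://images.cocodataset.org/zips/train2017.zip',  # 19G
        'http://images.cocodataset.org/zips/val2017.zip']    # 1G
download(urls, dir='../datasets/coco/images', threads=2)  # unzips, then removes the archives
```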

@ -0,0 +1,17 @@
#!/bin/bash
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Download COCO128 dataset https://www.kaggle.com/ultralytics/coco128 (first 128 images from COCO train2017)
# Example usage: bash data/scripts/get_coco128.sh
# parent
# ├── yolov5
# └── datasets
# └── coco128 ← downloads here
# Download/unzip images and labels
d='../datasets' # unzip directory
url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
f='coco128.zip' # or 'coco128-segments.zip', 68 MB
echo 'Downloading' $url$f ' ...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f &
wait # finish background tasks

@ -0,0 +1,102 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# xView 2018 dataset https://challenge.xviewdataset.org
# -------- DOWNLOAD DATA MANUALLY from URL above and unzip to 'datasets/xView' before running train command! --------
# Example usage: python train.py --data xView.yaml
# parent
# ├── yolov5
# └── datasets
# └── xView ← downloads here
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/xView # dataset root dir
train: images/autosplit_train.txt # train images (relative to 'path') 90% of 847 train images
val: images/autosplit_val.txt # val images (relative to 'path') 10% of 847 train images
# Classes
nc: 60 # number of classes
names: ['Fixed-wing Aircraft', 'Small Aircraft', 'Cargo Plane', 'Helicopter', 'Passenger Vehicle', 'Small Car', 'Bus',
'Pickup Truck', 'Utility Truck', 'Truck', 'Cargo Truck', 'Truck w/Box', 'Truck Tractor', 'Trailer',
'Truck w/Flatbed', 'Truck w/Liquid', 'Crane Truck', 'Railway Vehicle', 'Passenger Car', 'Cargo Car',
'Flat Car', 'Tank car', 'Locomotive', 'Maritime Vessel', 'Motorboat', 'Sailboat', 'Tugboat', 'Barge',
'Fishing Vessel', 'Ferry', 'Yacht', 'Container Ship', 'Oil Tanker', 'Engineering Vehicle', 'Tower crane',
'Container Crane', 'Reach Stacker', 'Straddle Carrier', 'Mobile Crane', 'Dump Truck', 'Haul Truck',
'Scraper/Tractor', 'Front loader/Bulldozer', 'Excavator', 'Cement Mixer', 'Ground Grader', 'Hut/Tent', 'Shed',
'Building', 'Aircraft Hangar', 'Damaged Building', 'Facility', 'Construction Site', 'Vehicle Lot', 'Helipad',
'Storage Tank', 'Shipping container lot', 'Shipping Container', 'Pylon', 'Tower'] # class names
# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
import json
import os
from pathlib import Path
import numpy as np
from PIL import Image
from tqdm import tqdm
from utils.datasets import autosplit
from utils.general import download, xyxy2xywhn
def convert_labels(fname=Path('xView/xView_train.geojson')):
# Convert xView geoJSON labels to YOLO format
path = fname.parent
with open(fname) as f:
print(f'Loading {fname}...')
data = json.load(f)
# Make dirs
labels = Path(path / 'labels' / 'train')
os.system(f'rm -rf {labels}')
labels.mkdir(parents=True, exist_ok=True)
# xView classes 11-94 to 0-59
xview_class2index = [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 2, -1, 3, -1, 4, 5, 6, 7, 8, -1, 9, 10, 11,
12, 13, 14, 15, -1, -1, 16, 17, 18, 19, 20, 21, 22, -1, 23, 24, 25, -1, 26, 27, -1, 28, -1,
29, 30, 31, 32, 33, 34, 35, 36, 37, -1, 38, 39, 40, 41, 42, 43, 44, 45, -1, -1, -1, -1, 46,
47, 48, 49, -1, 50, 51, -1, 52, -1, -1, -1, 53, 54, -1, 55, -1, -1, 56, -1, 57, -1, 58, 59]
shapes = {}
for feature in tqdm(data['features'], desc=f'Converting {fname}'):
p = feature['properties']
if p['bounds_imcoords']:
id = p['image_id']
file = path / 'train_images' / id
if file.exists(): # 1395.tif missing
try:
box = np.array([int(num) for num in p['bounds_imcoords'].split(",")])
assert box.shape[0] == 4, f'incorrect box shape {box.shape[0]}'
cls = p['type_id']
cls = xview_class2index[int(cls)] # xView class to 0-59
assert 59 >= cls >= 0, f'incorrect class index {cls}'
# Write YOLO label
if id not in shapes:
shapes[id] = Image.open(file).size
box = xyxy2xywhn(box[None].astype(float), w=shapes[id][0], h=shapes[id][1], clip=True)  # np.float is removed in modern NumPy
with open((labels / id).with_suffix('.txt'), 'a') as f:
f.write(f"{cls} {' '.join(f'{x:.6f}' for x in box[0])}\n") # write label.txt
except Exception as e:
print(f'WARNING: skipping one label for {file}: {e}')
# Download manually from https://challenge.xviewdataset.org
dir = Path(yaml['path']) # dataset root dir
# urls = ['https://d307kc0mrhucc3.cloudfront.net/train_labels.zip', # train labels
# 'https://d307kc0mrhucc3.cloudfront.net/train_images.zip', # 15G, 847 train images
# 'https://d307kc0mrhucc3.cloudfront.net/val_images.zip'] # 5G, 282 val images (no labels)
# download(urls, dir=dir, delete=False)
# Convert labels
convert_labels(dir / 'xView_train.geojson')
# Move images
images = Path(dir / 'images')
images.mkdir(parents=True, exist_ok=True)
Path(dir / 'train_images').rename(dir / 'images' / 'train')
Path(dir / 'val_images').rename(dir / 'images' / 'val')
# Split
autosplit(dir / 'images' / 'train')
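The `xview_class2index` table above compresses xView's sparse type_ids (11-94) onto contiguous YOLO ids (0-59), with `-1` marking unused ids. A standalone spot-check of that table:

```python
xview_class2index = [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 2, -1, 3, -1, 4, 5, 6, 7, 8, -1, 9, 10, 11,
                     12, 13, 14, 15, -1, -1, 16, 17, 18, 19, 20, 21, 22, -1, 23, 24, 25, -1, 26, 27, -1, 28, -1,
                     29, 30, 31, 32, 33, 34, 35, 36, 37, -1, 38, 39, 40, 41, 42, 43, 44, 45, -1, -1, -1, -1, 46,
                     47, 48, 49, -1, 50, 51, -1, 52, -1, -1, -1, 53, 54, -1, 55, -1, -1, 56, -1, 57, -1, 58, 59]

assert len(xview_class2index) == 95 and xview_class2index[11] == 0 and xview_class2index[94] == 59
valid = [i for i in xview_class2index if i >= 0]
assert valid == list(range(60))  # all 60 classes assigned exactly once, in order
```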

@ -0,0 +1,236 @@
# -*- coding: utf-8 -*-
"""
-------------------------------------------------
Project Name: yolov5-jishi
File Name: data_gen.py
Author: chenming
Create Date: 2021/11/8
Description
-------------------------------------------------
"""
# -*- coding: utf-8 -*-
# @Time : 20210610
# @Author : dejahu
# @File : gen_yolo_data.py
# @Software: PyCharm
# @Brief : Generate train/val/test images and labels
import os
import shutil
from pathlib import Path
from shutil import copyfile
import cv2
from PIL import Image, ImageDraw
from xml.dom.minidom import parse
import numpy as np
import os.path as osp
import random
# main first generates the data split in the current directory
# then converts the dataset
# keep the images-replacement logic in the code consistent with your files
# TODO: change to your data root directory
FILE_ROOT = "/scm/data/xianyu/Mask/"
IMAGE_SET_ROOT = FILE_ROOT + "VOC2021_Mask/ImageSets/Main" # path of the train/val split files
IMAGE_PATH = FILE_ROOT + "VOC2021_Mask/JPEGImages" # image location
ANNOTATIONS_PATH = FILE_ROOT + "VOC2021_Mask/Annotations" # dataset annotation (xml) location
LABELS_ROOT = FILE_ROOT + "VOC2021_Mask/Labels" # destination for the normalized labels
DEST_PPP = FILE_ROOT + "mask_yolo_format"
DEST_IMAGES_PATH = "mask_yolo_format/images" # target image paths for the train/val/test splits
DEST_LABELS_PATH = "mask_yolo_format/labels" # target label paths for the train/val/test splits
if osp.isdir(LABELS_ROOT):
shutil.rmtree(LABELS_ROOT)
print("Labels directory already existed and was removed")
if osp.isdir(DEST_PPP):
shutil.rmtree(DEST_PPP)
print("Dest directory already existed and was removed")
# TODO: change to your dataset's label names
label_names = ['face', 'face_mask']
def cord_converter(size, box):
"""
Convert VOC xml annotation boxes to darknet-style coordinates
:param size: image size [w, h]
:param box: anchor box coordinates [xmin, ymin, xmax, ymax]
:return: converted [x, y, w, h], normalized to [0, 1]
"""
x1 = int(box[0])
y1 = int(box[1])
x2 = int(box[2])
y2 = int(box[3])
dw = np.float32(1. / int(size[0]))
dh = np.float32(1. / int(size[1]))
w = x2 - x1
h = y2 - y1
x = x1 + (w / 2)
y = y1 + (h / 2)
x = x * dw
w = w * dw
y = y * dh
h = h * dh
return [x, y, w, h]
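# Worked example (made-up numbers): for a 640x480 image and box [100, 100, 300, 200],
# w=200, h=100, center=(200, 150) -> normalized [0.3125, 0.3125, 0.3125, 0.2083]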
def save_file(img_jpg_file_name, size, img_box):
# print("保存图片")
save_file_name = LABELS_ROOT + '/' + img_jpg_file_name + '.txt'
# print(save_file_name)
file_path = open(save_file_name, "a+")
# class id defaults to 0 to guard against unexpected label names
for box in img_box:
box_name = box[0]
cls_num = 0
if box_name in label_names:
cls_num = label_names.index(box_name)
new_box = cord_converter(size, box[1:])
file_path.write(f"{cls_num} {new_box[0]} {new_box[1]} {new_box[2]} {new_box[3]}\n")
file_path.flush()
file_path.close()
def test_dataset_box_feature(file_name, point_array):
"""
Visualize the dataset's boxes on a sample image
:param file_name: image file name
:param point_array: all boxes, each [cls, x1, y1, x2, y2]
:return: None
"""
im = Image.open(rf"{IMAGE_PATH}\{file_name}")
imDraw = ImageDraw.Draw(im)
for box in point_array:
x1 = box[1]
y1 = box[2]
x2 = box[3]
y2 = box[4]
imDraw.rectangle((x1, y1, x2, y2), outline='red')
im.show()
def get_xml_data(file_path, img_xml_file):
img_path = file_path + '/' + img_xml_file + '.xml'
# print(img_path)
dom = parse(img_path)
root = dom.documentElement
img_name = root.getElementsByTagName("filename")[0].childNodes[0].data
img_jpg_file_name = img_xml_file + '.jpg'
# print(img_jpg_file_name)
size_nodes = root.getElementsByTagName("size")  # some annotations lack a <size> element
if len(size_nodes) == 0: # fall back to reading the image itself
img_h, img_w, img_c = cv2.imread(osp.join(IMAGE_PATH, img_jpg_file_name)).shape
else:
img_size = size_nodes[0]
img_w = img_size.getElementsByTagName("width")[0].childNodes[0].data
img_h = img_size.getElementsByTagName("height")[0].childNodes[0].data
img_c = img_size.getElementsByTagName("depth")[0].childNodes[0].data
objects = root.getElementsByTagName("object")
img_box = []
for box in objects:
cls_name = box.getElementsByTagName("name")[0].childNodes[0].data
# TODO: adapt the coordinate conversion logic if needed
x1 = int(float(box.getElementsByTagName("xmin")[0].childNodes[0].data))
y1 = int(float(box.getElementsByTagName("ymin")[0].childNodes[0].data))
x2 = int(float(box.getElementsByTagName("xmax")[0].childNodes[0].data))
y2 = int(float(box.getElementsByTagName("ymax")[0].childNodes[0].data))
# print("box:(c,xmin,ymin,xmax,ymax)", cls_name, x1, y1, x2, y2)
img_box.append([cls_name, x1, y1, x2, y2])
# test_dataset_box_feature(img_jpg_file_name, img_box)
save_file(img_xml_file, [img_w, img_h], img_box)
def copy_data(img_set_source, img_labels_root, imgs_source, type):
file_name = img_set_source + '/' + type + ".txt"
file = open(file_name)
# create the target folders if they do not already exist
root_file = Path(FILE_ROOT + DEST_IMAGES_PATH + '/' + type)
if not root_file.exists():
print(f"Path {root_file} does not exist, creating it")
os.makedirs(root_file)
root_file = Path(FILE_ROOT + DEST_LABELS_PATH + '/' + type)
if not root_file.exists():
print(f"Path {root_file} does not exist, creating it")
os.makedirs(root_file)
# iterate over the entries in the split file
for line in file.readlines():
# print(line)
img_name = line.strip('\n')
img_sor_file = imgs_source + '/' + img_name + '.jpg'
label_sor_file = img_labels_root + '/' + img_name + '.txt'
# copy the image
DICT_DIR = FILE_ROOT + DEST_IMAGES_PATH + '/' + type
img_dict_file = DICT_DIR + '/' + img_name + '.jpg'
copyfile(img_sor_file, img_dict_file)
# copy the label
DICT_DIR = FILE_ROOT + DEST_LABELS_PATH + '/' + type
img_dict_file = DICT_DIR + '/' + img_name + '.txt'
copyfile(label_sor_file, img_dict_file)
if __name__ == '__main__':
# split the files into train and val sets
# test-set generation is still experimental and not released yet
img_set_root = IMAGE_SET_ROOT
imgs_root = IMAGE_PATH
img_labels_root = LABELS_ROOT
if not osp.isdir(img_labels_root):
os.makedirs(img_labels_root)
os.makedirs(DEST_PPP)
names = os.listdir(ANNOTATIONS_PATH)
# real_names = []
# for name in names:
# if name.split(".")[-1] == "xml":
# real_names.append(name.split(".")[0])
# # print(real_names)
# # print(real_names)
# random.shuffle(real_names)
# # print(real_names)
# length = len(real_names)
# split_point = int(length * 0.2)
#
# val_names = real_names[:split_point]
# train_names = real_names[split_point:]
#
# # start writing the split files
# np.savetxt('data/val.txt', np.array(val_names), fmt="%s", delimiter="\n")
# np.savetxt('data/test.txt', np.array(val_names), fmt="%s", delimiter="\n")
# np.savetxt('data/train.txt', np.array(train_names), fmt="%s", delimiter="\n")
# print("txt files generated; place them under VOC2012's ImageSets/Main directory")
# generate labels
root = ANNOTATIONS_PATH
files_all = os.listdir(root)
files = []
for file in files_all:
if file.split(".")[-1] == "xml":
files.append(file.split(".")[0])
# print(len(files))
for file in files:
# print("file name: ", file)
file_xml = file.split(".")
# try:
get_xml_data(root, file_xml[0])
# except:
# print(file_xml[0])
# break
# copy_data(img_set_root, img_labels_root, imgs_root, "train")
# copy_data(img_set_root, img_labels_root, imgs_root, "val")
# copy_data(img_set_root, img_labels_root, imgs_root, "test")
# data = list(np.loadtxt("tt100k.txt", dtype=str))
# print(data)
# print(len(data))
print("数据转换已完成!")

@ -0,0 +1,246 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Run inference on images, videos, directories, streams, etc.
Usage:
$ python path/to/detect.py --weights yolov5s.pt --source 0 # webcam
img.jpg # image
vid.mp4 # video
path/ # directory
path/*.jpg # glob
'https://youtu.be/Zgi9g1ksQHc' # YouTube
'rtsp://example.com/media.mp4' # RTSP, RTMP, HTTP stream
"""
import argparse
import os
import sys
from pathlib import Path
import cv2
import torch
import torch.backends.cudnn as cudnn
FILE = Path(__file__).resolve()
ROOT = FILE.parents[0] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from models.common import DetectMultiBackend
from utils.datasets import IMG_FORMATS, VID_FORMATS, LoadImages, LoadStreams
from utils.general import (LOGGER, check_file, check_img_size, check_imshow, check_requirements, colorstr,
increment_path, non_max_suppression, print_args, scale_coords, strip_optimizer, xyxy2xywh)
from utils.plots import Annotator, colors, save_one_box
from utils.torch_utils import select_device, time_sync
@torch.no_grad()
def run(weights=ROOT / 'yolov5s.pt', # model.pt path(s)
source=ROOT / 'data/images', # file/dir/URL/glob, 0 for webcam
imgsz=640, # inference size (pixels)
conf_thres=0.25, # confidence threshold
iou_thres=0.45, # NMS IOU threshold
max_det=1000, # maximum detections per image
device='', # cuda device, i.e. 0 or 0,1,2,3 or cpu
view_img=False, # show results
save_txt=False, # save results to *.txt
save_conf=False, # save confidences in --save-txt labels
save_crop=False, # save cropped prediction boxes
nosave=False, # do not save images/videos
classes=None, # filter by class: --class 0, or --class 0 2 3
agnostic_nms=False, # class-agnostic NMS
augment=False, # augmented inference
visualize=False, # visualize features
update=False, # update all models
project=ROOT / 'runs/detect', # save results to project/name
name='exp', # save results to project/name
exist_ok=False, # existing project/name ok, do not increment
line_thickness=3, # bounding box thickness (pixels)
hide_labels=False, # hide labels
hide_conf=False, # hide confidences
half=False, # use FP16 half-precision inference
dnn=False, # use OpenCV DNN for ONNX inference
):
source = str(source)
save_img = not nosave and not source.endswith('.txt') # save inference images
is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
if is_url and is_file:
source = check_file(source) # download
# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run
(save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True) # make dir
# Load model
device = select_device(device)
model = DetectMultiBackend(weights, device=device, dnn=dnn)
stride, names, pt, jit, onnx = model.stride, model.names, model.pt, model.jit, model.onnx
imgsz = check_img_size(imgsz, s=stride) # check image size
# Half
half &= pt and device.type != 'cpu' # half precision only supported by PyTorch on CUDA
if pt:
model.model.half() if half else model.model.float()
# Dataloader
if webcam:
view_img = check_imshow()
cudnn.benchmark = True # set True to speed up constant image size inference
dataset = LoadStreams(source, img_size=imgsz, stride=stride, auto=pt and not jit)
bs = len(dataset) # batch_size
else:
dataset = LoadImages(source, img_size=imgsz, stride=stride, auto=pt and not jit)
bs = 1 # batch_size
vid_path, vid_writer = [None] * bs, [None] * bs
# Run inference
if pt and device.type != 'cpu':
model(torch.zeros(1, 3, *imgsz).to(device).type_as(next(model.model.parameters()))) # warmup
dt, seen = [0.0, 0.0, 0.0], 0
for path, im, im0s, vid_cap, s in dataset:
t1 = time_sync()
im = torch.from_numpy(im).to(device)
im = im.half() if half else im.float() # uint8 to fp16/32
im /= 255 # 0 - 255 to 0.0 - 1.0
if len(im.shape) == 3:
im = im[None] # expand for batch dim
t2 = time_sync()
dt[0] += t2 - t1
# Inference
visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if visualize else False
pred = model(im, augment=augment, visualize=visualize)
t3 = time_sync()
dt[1] += t3 - t2
# NMS
pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)
dt[2] += time_sync() - t3
# Second-stage classifier (optional)
# pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)
# Process predictions
for i, det in enumerate(pred): # per image
seen += 1
if webcam: # batch_size >= 1
p, im0, frame = path[i], im0s[i].copy(), dataset.count
s += f'{i}: '
else:
p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)
p = Path(p) # to Path
save_path = str(save_dir / p.name) # im.jpg
txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}') # im.txt
s += '%gx%g ' % im.shape[2:] # print string
gn = torch.tensor(im0.shape)[[1, 0, 1, 0]] # normalization gain whwh
imc = im0.copy() if save_crop else im0 # for save_crop
annotator = Annotator(im0, line_width=line_thickness, example=str(names))
if len(det):
# Rescale boxes from img_size to im0 size
det[:, :4] = scale_coords(im.shape[2:], det[:, :4], im0.shape).round()
# Print results
for c in det[:, -1].unique():
n = (det[:, -1] == c).sum() # detections per class
s += f"{n} {names[int(c)]}{'s' * (n > 1)}, " # add to string
# Write results
for *xyxy, conf, cls in reversed(det):
if save_txt: # Write to file
xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh
line = (cls, *xywh, conf) if save_conf else (cls, *xywh) # label format
with open(txt_path + '.txt', 'a') as f:
f.write(('%g ' * len(line)).rstrip() % line + '\n')
if save_img or save_crop or view_img: # Add bbox to image
c = int(cls) # integer class
label = None if hide_labels else (names[c] if hide_conf else f'{names[c]} {conf:.2f}')
annotator.box_label(xyxy, label, color=colors(c, True))
if save_crop:
save_one_box(xyxy, imc, file=save_dir / 'crops' / names[c] / f'{p.stem}.jpg', BGR=True)
# Print time (inference-only)
LOGGER.info(f'{s}Done. ({t3 - t2:.3f}s)')
# Stream results
im0 = annotator.result()
if view_img:
cv2.imshow(str(p), im0)
cv2.waitKey(1) # 1 millisecond
# Save results (image with detections)
if save_img:
if dataset.mode == 'image':
cv2.imwrite(save_path, im0)
else: # 'video' or 'stream'
if vid_path[i] != save_path: # new video
vid_path[i] = save_path
if isinstance(vid_writer[i], cv2.VideoWriter):
vid_writer[i].release() # release previous video writer
if vid_cap: # video
fps = vid_cap.get(cv2.CAP_PROP_FPS)
w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
else: # stream
fps, w, h = 30, im0.shape[1], im0.shape[0]
save_path += '.mp4'
vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
vid_writer[i].write(im0)
# Print results
t = tuple(x / seen * 1E3 for x in dt) # speeds per image
LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}' % t)
if save_txt or save_img:
s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
if update:
strip_optimizer(weights) # update model (to fix SourceChangeWarning)
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model path(s)')
parser.add_argument('--source', type=str, default=ROOT / 'data/images', help='file/dir/URL/glob, 0 for webcam')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
parser.add_argument('--conf-thres', type=float, default=0.5, help='confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--view-img', action='store_true', help='show results')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--visualize', action='store_true', help='visualize features')
parser.add_argument('--update', action='store_true', help='update all models')
parser.add_argument('--project', default=ROOT / 'runs/detect', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
opt = parser.parse_args()
opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1 # expand
print_args(FILE.stem, opt)
return opt
def main(opt):
check_requirements(exclude=('tensorboard', 'thop'))
run(**vars(opt))
# Command-line usage:
# python detect.py --weights runs/train/exp_yolov5s/weights/best.pt --source data/images/fishman.jpg # image
if __name__ == "__main__":
opt = parse_opt()
main(opt)
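Because `run()` carries defaults for every argument and is decorated with `@torch.no_grad()`, the same inference can also be launched from Python. A minimal sketch mirroring the CLI comment above:

```python
from detect import run  # this file

run(weights='runs/train/exp_yolov5s/weights/best.pt',
    source='data/images/fishman.jpg',
    conf_thres=0.5)  # results are saved under runs/detect/exp*
```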

@ -0,0 +1,61 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Start FROM Nvidia PyTorch image https://ngc.nvidia.com/catalog/containers/nvidia:pytorch
FROM nvcr.io/nvidia/pytorch:21.10-py3
# Install linux packages
RUN apt update && apt install -y zip htop screen libgl1-mesa-glx
# Install python dependencies
COPY requirements.txt .
RUN python -m pip install --upgrade pip
RUN pip uninstall -y nvidia-tensorboard nvidia-tensorboard-plugin-dlprof
RUN pip install --no-cache -r requirements.txt coremltools onnx gsutil notebook 'wandb>=0.12.2'  # quoted so '>' is not a shell redirect
RUN pip install --no-cache -U torch torchvision numpy Pillow
# RUN pip install --no-cache torch==1.10.0+cu113 torchvision==0.11.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
# Create working directory
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
# Copy contents
COPY . /usr/src/app
# Downloads to user config dir
ADD https://ultralytics.com/assets/Arial.ttf /root/.config/Ultralytics/
# Set environment variables
# ENV HOME=/usr/src/app
# Usage Examples -------------------------------------------------------------------------------------------------------
# Build and Push
# t=ultralytics/yolov5:latest && sudo docker build -t $t . && sudo docker push $t
# Pull and Run
# t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all $t
# Pull and Run with local directory access
# t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all -v "$(pwd)"/datasets:/usr/src/datasets $t
# Kill all
# sudo docker kill $(sudo docker ps -q)
# Kill all image-based
# sudo docker kill $(sudo docker ps -qa --filter ancestor=ultralytics/yolov5:latest)
# Bash into running container
# sudo docker exec -it 5a9b5863d93d bash
# Bash into stopped container
# id=$(sudo docker ps -qa) && sudo docker start $id && sudo docker exec -it $id bash
# Clean up
# docker system prune -a --volumes
# Update Ubuntu drivers
# https://www.maketecheasier.com/install-nvidia-drivers-ubuntu/
# DDP test
# python -m torch.distributed.run --nproc_per_node 2 --master_port 1 train.py --epochs 3

@ -0,0 +1,292 @@
<div align="center">
<p>
<a align="left" href="https://ultralytics.com/yolov5" target="_blank">
<img width="850" src="https://github.com/ultralytics/yolov5/releases/download/v1.0/splash.jpg"></a>
</p>
<br>
<div>
<a href="https://github.com/ultralytics/yolov5/actions"><img src="https://github.com/ultralytics/yolov5/workflows/CI%20CPU%20testing/badge.svg" alt="CI CPU testing"></a>
<a href="https://zenodo.org/badge/latestdoi/264818686"><img src="https://zenodo.org/badge/264818686.svg" alt="YOLOv5 Citation"></a>
<a href="https://hub.docker.com/r/ultralytics/yolov5"><img src="https://img.shields.io/docker/pulls/ultralytics/yolov5?logo=docker" alt="Docker Pulls"></a>
<br>
<a href="https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
<a href="https://www.kaggle.com/ultralytics/yolov5"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" alt="Open In Kaggle"></a>
<a href="https://join.slack.com/t/ultralytics/shared_invite/zt-w29ei8bp-jczz7QYUmDtgo6r6KcMIAg"><img src="https://img.shields.io/badge/Slack-Join_Forum-blue.svg?logo=slack" alt="Join Forum"></a>
</div>
<br>
<div align="center">
<a href="https://github.com/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-github.png" width="2%"/>
</a>
<img width="2%" />
<a href="https://www.linkedin.com/company/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-linkedin.png" width="2%"/>
</a>
<img width="2%" />
<a href="https://twitter.com/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-twitter.png" width="2%"/>
</a>
<img width="2%" />
<a href="https://youtube.com/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-youtube.png" width="2%"/>
</a>
<img width="2%" />
<a href="https://www.facebook.com/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-facebook.png" width="2%"/>
</a>
<img width="2%" />
<a href="https://www.instagram.com/ultralytics/">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-instagram.png" width="2%"/>
</a>
</div>
<br>
<p>
YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset, and represents <a href="https://ultralytics.com">Ultralytics</a>
open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.
</p>
<!--
<a align="center" href="https://ultralytics.com/yolov5" target="_blank">
<img width="800" src="https://github.com/ultralytics/yolov5/releases/download/v1.0/banner-api.png"></a>
-->
</div>
## <div align="center">Documentation</div>
See the [YOLOv5 Docs](https://docs.ultralytics.com) for full documentation on training, testing and deployment.
## <div align="center">Quick Start Examples</div>
<details open>
<summary>Install</summary>
[**Python>=3.6.0**](https://www.python.org/) is required with all
[requirements.txt](https://github.com/ultralytics/yolov5/blob/master/requirements.txt) installed including
[**PyTorch>=1.7**](https://pytorch.org/get-started/locally/):
<!-- $ sudo apt update && apt install -y libgl1-mesa-glx libsm6 libxext6 libxrender-dev -->
```bash
$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt
```
</details>
<details open>
<summary>Inference</summary>
Inference with YOLOv5 and [PyTorch Hub](https://github.com/ultralytics/yolov5/issues/36). Models automatically download
from the [latest YOLOv5 release](https://github.com/ultralytics/yolov5/releases).
```python
import torch
# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s') # or yolov5m, yolov5l, yolov5x, custom
# Images
img = 'https://ultralytics.com/images/zidane.jpg' # or file, Path, PIL, OpenCV, numpy, list
# Inference
results = model(img)
# Results
results.print() # or .show(), .save(), .crop(), .pandas(), etc.
```
</details>
<details>
<summary>Inference with detect.py</summary>
`detect.py` runs inference on a variety of sources, downloading models automatically from
the [latest YOLOv5 release](https://github.com/ultralytics/yolov5/releases) and saving results to `runs/detect`.
```bash
$ python detect.py --source 0 # webcam
img.jpg # image
vid.mp4 # video
path/ # directory
path/*.jpg # glob
'https://youtu.be/Zgi9g1ksQHc' # YouTube
'rtsp://example.com/media.mp4' # RTSP, RTMP, HTTP stream
```
</details>
<details>
<summary>Training</summary>
Run commands below to reproduce results
on [COCO](https://github.com/ultralytics/yolov5/blob/master/data/scripts/get_coco.sh) dataset (dataset auto-downloads on
first use). Training times for YOLOv5s/m/l/x are 2/4/6/8 days on a single V100 (multi-GPU times faster). Use the
largest `--batch-size` your GPU allows (batch sizes shown for 16 GB devices).
```bash
$ python train.py --data coco.yaml --cfg yolov5s.yaml --weights '' --batch-size 64
yolov5m 40
yolov5l 24
yolov5x 16
```
<img width="800" src="https://user-images.githubusercontent.com/26833433/90222759-949d8800-ddc1-11ea-9fa1-1c97eed2b963.png">
</details>
<details open>
<summary>Tutorials</summary>
* [Train Custom Data](https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data)&nbsp; 🚀 RECOMMENDED
* [Tips for Best Training Results](https://github.com/ultralytics/yolov5/wiki/Tips-for-Best-Training-Results)&nbsp; ☘️
RECOMMENDED
* [Weights & Biases Logging](https://github.com/ultralytics/yolov5/issues/1289)&nbsp; 🌟 NEW
* [Roboflow for Datasets, Labeling, and Active Learning](https://github.com/ultralytics/yolov5/issues/4975)&nbsp; 🌟 NEW
* [Multi-GPU Training](https://github.com/ultralytics/yolov5/issues/475)
* [PyTorch Hub](https://github.com/ultralytics/yolov5/issues/36)&nbsp; ⭐ NEW
* [TorchScript, ONNX, CoreML Export](https://github.com/ultralytics/yolov5/issues/251) 🚀
* [Test-Time Augmentation (TTA)](https://github.com/ultralytics/yolov5/issues/303)
* [Model Ensembling](https://github.com/ultralytics/yolov5/issues/318)
* [Model Pruning/Sparsity](https://github.com/ultralytics/yolov5/issues/304)
* [Hyperparameter Evolution](https://github.com/ultralytics/yolov5/issues/607)
* [Transfer Learning with Frozen Layers](https://github.com/ultralytics/yolov5/issues/1314)&nbsp; ⭐ NEW
* [TensorRT Deployment](https://github.com/wang-xinyu/tensorrtx)
</details>
## <div align="center">Environments</div>
Get started in seconds with our verified environments. Click each icon below for details.
<div align="center">
<a href="https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-colab-small.png" width="15%"/>
</a>
<a href="https://www.kaggle.com/ultralytics/yolov5">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-kaggle-small.png" width="15%"/>
</a>
<a href="https://hub.docker.com/r/ultralytics/yolov5">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-docker-small.png" width="15%"/>
</a>
<a href="https://github.com/ultralytics/yolov5/wiki/AWS-Quickstart">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-aws-small.png" width="15%"/>
</a>
<a href="https://github.com/ultralytics/yolov5/wiki/GCP-Quickstart">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-gcp-small.png" width="15%"/>
</a>
</div>
## <div align="center">Integrations</div>
<div align="center">
<a href="https://wandb.ai/site?utm_campaign=repo_yolo_readme">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-wb-long.png" width="49%"/>
</a>
<a href="https://roboflow.com/?ref=ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-roboflow-long.png" width="49%"/>
</a>
</div>
|Weights and Biases|Roboflow ⭐ NEW|
|:-:|:-:|
|Automatically track and visualize all your YOLOv5 training runs in the cloud with [Weights & Biases](https://wandb.ai/site?utm_campaign=repo_yolo_readme)|Label and export your custom datasets directly to YOLOv5 for training with [Roboflow](https://roboflow.com/?ref=ultralytics) |
<!-- ## <div align="center">Compete and Win</div>
We are super excited about our first-ever Ultralytics YOLOv5 🚀 EXPORT Competition with **$10,000** in cash prizes!
<p align="center">
<a href="https://github.com/ultralytics/yolov5/discussions/3213">
<img width="850" src="https://github.com/ultralytics/yolov5/releases/download/v1.0/banner-export-competition.png"></a>
</p> -->
## <div align="center">Why YOLOv5</div>
<p align="left"><img width="800" src="https://user-images.githubusercontent.com/26833433/136901921-abcfcd9d-f978-4942-9b97-0e3f202907df.png"></p>
<details>
<summary>YOLOv5-P5 640 Figure (click to expand)</summary>
<p align="left"><img width="800" src="https://user-images.githubusercontent.com/26833433/136763877-b174052b-c12f-48d2-8bc4-545e3853398e.png"></p>
</details>
<details>
<summary>Figure Notes (click to expand)</summary>
* **COCO AP val** denotes mAP@0.5:0.95 metric measured on the 5000-image [COCO val2017](http://cocodataset.org) dataset over various inference sizes from 256 to 1536.
* **GPU Speed** measures average inference time per image on [COCO val2017](http://cocodataset.org) dataset using a [AWS p3.2xlarge](https://aws.amazon.com/ec2/instance-types/p3/) V100 instance at batch-size 32.
* **EfficientDet** data from [google/automl](https://github.com/google/automl) at batch size 8.
* **Reproduce** by `python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n6.pt yolov5s6.pt yolov5m6.pt yolov5l6.pt yolov5x6.pt`
</details>
### Pretrained Checkpoints
[assets]: https://github.com/ultralytics/yolov5/releases
[TTA]: https://github.com/ultralytics/yolov5/issues/303
|Model |size<br><sup>(pixels) |mAP<sup>val<br>0.5:0.95 |mAP<sup>val<br>0.5 |Speed<br><sup>CPU b1<br>(ms) |Speed<br><sup>V100 b1<br>(ms) |Speed<br><sup>V100 b32<br>(ms) |params<br><sup>(M) |FLOPs<br><sup>@640 (B)
|--- |--- |--- |--- |--- |--- |--- |--- |---
|[YOLOv5n][assets] |640 |28.4 |46.0 |**45** |**6.3**|**0.6**|**1.9**|**4.5**
|[YOLOv5s][assets] |640 |37.2 |56.0 |98 |6.4 |0.9 |7.2 |16.5
|[YOLOv5m][assets] |640 |45.2 |63.9 |224 |8.2 |1.7 |21.2 |49.0
|[YOLOv5l][assets] |640 |48.8 |67.2 |430 |10.1 |2.7 |46.5 |109.1
|[YOLOv5x][assets] |640 |50.7 |68.9 |766 |12.1 |4.8 |86.7 |205.7
| | | | | | | | |
|[YOLOv5n6][assets] |1280 |34.0 |50.7 |153 |8.1 |2.1 |3.2 |4.6
|[YOLOv5s6][assets] |1280 |44.5 |63.0 |385 |8.2 |3.6 |16.8 |12.6
|[YOLOv5m6][assets] |1280 |51.0 |69.0 |887 |11.1 |6.8 |35.7 |50.0
|[YOLOv5l6][assets] |1280 |53.6 |71.6 |1784 |15.8 |10.5 |76.8 |111.4
|[YOLOv5x6][assets]<br>+ [TTA][TTA]|1280<br>1536 |54.7<br>**55.4** |**72.4**<br>72.3 |3136<br>- |26.2<br>- |19.4<br>- |140.7<br>- |209.8<br>-
<details>
<summary>Table Notes (click to expand)</summary>
* All checkpoints are trained to 300 epochs with default settings and hyperparameters.
* **mAP<sup>val</sup>** values are for single-model single-scale on [COCO val2017](http://cocodataset.org) dataset.<br>Reproduce by `python val.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65`
* **Speed** averaged over COCO val images using a [AWS p3.2xlarge](https://aws.amazon.com/ec2/instance-types/p3/) instance. NMS times (~1 ms/img) not included.<br>Reproduce by `python val.py --data coco.yaml --img 640 --conf 0.25 --iou 0.45`
* **TTA** [Test Time Augmentation](https://github.com/ultralytics/yolov5/issues/303) includes reflection and scale augmentations.<br>Reproduce by `python val.py --data coco.yaml --img 1536 --iou 0.7 --augment`
</details>
## <div align="center">Contribute</div>
We love your input! We want to make contributing to YOLOv5 as easy and transparent as possible. Please see our [Contributing Guide](CONTRIBUTING.md) to get started, and fill out the [YOLOv5 Survey](https://ultralytics.com/survey?utm_source=github&utm_medium=social&utm_campaign=Survey) to send us feedback on your experiences. Thank you to all our contributors!
<a href="https://github.com/ultralytics/yolov5/graphs/contributors"><img src="https://opencollective.com/ultralytics/contributors.svg?width=990" /></a>
## <div align="center">Contact</div>
For YOLOv5 bugs and feature requests please visit [GitHub Issues](https://github.com/ultralytics/yolov5/issues). For business inquiries or
professional support requests please visit [https://ultralytics.com/contact](https://ultralytics.com/contact).
<br>
<div align="center">
<a href="https://github.com/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-github.png" width="3%"/>
</a>
<img width="3%" />
<a href="https://www.linkedin.com/company/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-linkedin.png" width="3%"/>
</a>
<img width="3%" />
<a href="https://twitter.com/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-twitter.png" width="3%"/>
</a>
<img width="3%" />
<a href="https://youtube.com/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-youtube.png" width="3%"/>
</a>
<img width="3%" />
<a href="https://www.facebook.com/ultralytics">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-facebook.png" width="3%"/>
</a>
<img width="3%" />
<a href="https://www.instagram.com/ultralytics/">
<img src="https://github.com/ultralytics/yolov5/releases/download/v1.0/logo-social-instagram.png" width="3%"/>
</a>
</div>

File diff suppressed because it is too large.

@ -0,0 +1,369 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Export a YOLOv5 PyTorch model to TorchScript, ONNX, CoreML, TensorFlow (saved_model, pb, TFLite, TF.js) formats
TensorFlow exports authored by https://github.com/zldrobit
Usage:
$ python path/to/export.py --weights yolov5s.pt --include torchscript onnx coreml saved_model pb tflite tfjs
Inference:
$ python path/to/detect.py --weights yolov5s.pt
yolov5s.onnx (must export with --dynamic)
yolov5s_saved_model
yolov5s.pb
yolov5s.tflite
TensorFlow.js:
$ cd .. && git clone https://github.com/zldrobit/tfjs-yolov5-example.git && cd tfjs-yolov5-example
$ npm install
$ ln -s ../../yolov5/yolov5s_web_model public/yolov5s_web_model
$ npm start
"""
import argparse
import json
import os
import subprocess
import sys
import time
from pathlib import Path
import torch
import torch.nn as nn
from torch.utils.mobile_optimizer import optimize_for_mobile
FILE = Path(__file__).resolve()
ROOT = FILE.parents[0] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from models.common import Conv
from models.experimental import attempt_load
from models.yolo import Detect
from utils.activations import SiLU
from utils.datasets import LoadImages
from utils.general import (LOGGER, check_dataset, check_img_size, check_requirements, colorstr, file_size, print_args,
url2file)
from utils.torch_utils import select_device
def export_torchscript(model, im, file, optimize, prefix=colorstr('TorchScript:')):
# YOLOv5 TorchScript model export
try:
LOGGER.info(f'\n{prefix} starting export with torch {torch.__version__}...')
f = file.with_suffix('.torchscript.pt')
ts = torch.jit.trace(model, im, strict=False)
d = {"shape": im.shape, "stride": int(max(model.stride)), "names": model.names}
extra_files = {'config.txt': json.dumps(d)} # torch._C.ExtraFilesMap()
(optimize_for_mobile(ts) if optimize else ts).save(f, _extra_files=extra_files)
LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
except Exception as e:
LOGGER.info(f'{prefix} export failure: {e}')
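# The saved archive can later be reloaded along with its metadata, e.g. (sketch):
#   extra_files = {'config.txt': ''}  # populated in place by torch.jit.load
#   ts = torch.jit.load('yolov5s.torchscript.pt', _extra_files=extra_files)
#   meta = json.loads(extra_files['config.txt'])  # shape, stride, names saved above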
def export_onnx(model, im, file, opset, train, dynamic, simplify, prefix=colorstr('ONNX:')):
# YOLOv5 ONNX export
try:
check_requirements(('onnx',))
import onnx
LOGGER.info(f'\n{prefix} starting export with onnx {onnx.__version__}...')
f = file.with_suffix('.onnx')
torch.onnx.export(model, im, f, verbose=False, opset_version=opset,
training=torch.onnx.TrainingMode.TRAINING if train else torch.onnx.TrainingMode.EVAL,
do_constant_folding=not train,
input_names=['images'],
output_names=['output'],
dynamic_axes={'images': {0: 'batch', 2: 'height', 3: 'width'}, # shape(1,3,640,640)
'output': {0: 'batch', 1: 'anchors'} # shape(1,25200,85)
} if dynamic else None)
# Checks
model_onnx = onnx.load(f) # load onnx model
onnx.checker.check_model(model_onnx) # check onnx model
# LOGGER.info(onnx.helper.printable_graph(model_onnx.graph)) # print
# Simplify
if simplify:
try:
check_requirements(('onnx-simplifier',))
import onnxsim
LOGGER.info(f'{prefix} simplifying with onnx-simplifier {onnxsim.__version__}...')
model_onnx, check = onnxsim.simplify(
model_onnx,
dynamic_input_shape=dynamic,
input_shapes={'images': list(im.shape)} if dynamic else None)
assert check, 'assert check failed'
onnx.save(model_onnx, f)
except Exception as e:
LOGGER.info(f'{prefix} simplifier failure: {e}')
LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
LOGGER.info(f"{prefix} run --dynamic ONNX model inference with: 'python detect.py --weights {f}'")
except Exception as e:
LOGGER.info(f'{prefix} export failure: {e}')
def export_coreml(model, im, file, prefix=colorstr('CoreML:')):
# YOLOv5 CoreML export
ct_model = None
try:
check_requirements(('coremltools',))
import coremltools as ct
LOGGER.info(f'\n{prefix} starting export with coremltools {ct.__version__}...')
f = file.with_suffix('.mlmodel')
model.train() # CoreML exports should be placed in model.train() mode
ts = torch.jit.trace(model, im, strict=False) # TorchScript model
ct_model = ct.convert(ts, inputs=[ct.ImageType('image', shape=im.shape, scale=1 / 255, bias=[0, 0, 0])])
ct_model.save(f)
LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
except Exception as e:
LOGGER.info(f'\n{prefix} export failure: {e}')
return ct_model
def export_saved_model(model, im, file, dynamic,
tf_nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45,
conf_thres=0.25, prefix=colorstr('TensorFlow saved_model:')):
# YOLOv5 TensorFlow saved_model export
keras_model = None
try:
import tensorflow as tf
from tensorflow import keras
from models.tf import TFDetect, TFModel
LOGGER.info(f'\n{prefix} starting export with tensorflow {tf.__version__}...')
f = str(file).replace('.pt', '_saved_model')
batch_size, ch, *imgsz = list(im.shape) # BCHW
tf_model = TFModel(cfg=model.yaml, model=model, nc=model.nc, imgsz=imgsz)
im = tf.zeros((batch_size, *imgsz, 3)) # BHWC order for TensorFlow
y = tf_model.predict(im, tf_nms, agnostic_nms, topk_per_class, topk_all, iou_thres, conf_thres)
inputs = keras.Input(shape=(*imgsz, 3), batch_size=None if dynamic else batch_size)
outputs = tf_model.predict(inputs, tf_nms, agnostic_nms, topk_per_class, topk_all, iou_thres, conf_thres)
keras_model = keras.Model(inputs=inputs, outputs=outputs)
keras_model.trainable = False
keras_model.summary()
keras_model.save(f, save_format='tf')
LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
except Exception as e:
LOGGER.info(f'\n{prefix} export failure: {e}')
return keras_model
def export_pb(keras_model, im, file, prefix=colorstr('TensorFlow GraphDef:')):
# YOLOv5 TensorFlow GraphDef *.pb export https://github.com/leimao/Frozen_Graph_TensorFlow
try:
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
LOGGER.info(f'\n{prefix} starting export with tensorflow {tf.__version__}...')
f = file.with_suffix('.pb')
m = tf.function(lambda x: keras_model(x)) # full model
m = m.get_concrete_function(tf.TensorSpec(keras_model.inputs[0].shape, keras_model.inputs[0].dtype))
frozen_func = convert_variables_to_constants_v2(m)
frozen_func.graph.as_graph_def()
tf.io.write_graph(graph_or_graph_def=frozen_func.graph, logdir=str(f.parent), name=f.name, as_text=False)
LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
except Exception as e:
LOGGER.info(f'\n{prefix} export failure: {e}')
def export_tflite(keras_model, im, file, int8, data, ncalib, prefix=colorstr('TensorFlow Lite:')):
# YOLOv5 TensorFlow Lite export
try:
import tensorflow as tf
from models.tf import representative_dataset_gen
LOGGER.info(f'\n{prefix} starting export with tensorflow {tf.__version__}...')
batch_size, ch, *imgsz = list(im.shape) # BCHW
f = str(file).replace('.pt', '-fp16.tflite')
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
converter.target_spec.supported_types = [tf.float16]
converter.optimizations = [tf.lite.Optimize.DEFAULT]
if int8:
dataset = LoadImages(check_dataset(data)['train'], img_size=imgsz, auto=False) # representative data
converter.representative_dataset = lambda: representative_dataset_gen(dataset, ncalib)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_types = []
converter.inference_input_type = tf.uint8 # or tf.int8
converter.inference_output_type = tf.uint8 # or tf.int8
converter.experimental_new_quantizer = False
f = str(file).replace('.pt', '-int8.tflite')
tflite_model = converter.convert()
open(f, "wb").write(tflite_model)
LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
except Exception as e:
LOGGER.info(f'\n{prefix} export failure: {e}')
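def _check_tflite_export(f='yolov5s-fp16.tflite'):
    # Sketch, not part of upstream export.py: exercise the exported *.tflite file with the
    # stock TF Lite interpreter (file name illustrative; INT8 models expect uint8 input).
    import numpy as np
    import tensorflow as tf
    interpreter = tf.lite.Interpreter(model_path=str(f))
    interpreter.allocate_tensors()
    inp, out = interpreter.get_input_details()[0], interpreter.get_output_details()[0]
    interpreter.set_tensor(inp['index'], np.zeros(inp['shape'], dtype=inp['dtype']))  # BHWC input
    interpreter.invoke()
    return interpreter.get_tensor(out['index'])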
def export_tfjs(keras_model, im, file, prefix=colorstr('TensorFlow.js:')):
# YOLOv5 TensorFlow.js export
try:
check_requirements(('tensorflowjs',))
import re
import tensorflowjs as tfjs
LOGGER.info(f'\n{prefix} starting export with tensorflowjs {tfjs.__version__}...')
f = str(file).replace('.pt', '_web_model') # js dir
f_pb = file.with_suffix('.pb') # *.pb path
f_json = f + '/model.json' # *.json path
cmd = f"tensorflowjs_converter --input_format=tf_frozen_model " \
f"--output_node_names='Identity,Identity_1,Identity_2,Identity_3' {f_pb} {f}"
subprocess.run(cmd, shell=True)
json = open(f_json).read()
with open(f_json, 'w') as j: # sort JSON Identity_* in ascending order
subst = re.sub(
r'{"outputs": {"Identity.?.?": {"name": "Identity.?.?"}, '
r'"Identity.?.?": {"name": "Identity.?.?"}, '
r'"Identity.?.?": {"name": "Identity.?.?"}, '
r'"Identity.?.?": {"name": "Identity.?.?"}}}',
r'{"outputs": {"Identity": {"name": "Identity"}, '
r'"Identity_1": {"name": "Identity_1"}, '
r'"Identity_2": {"name": "Identity_2"}, '
r'"Identity_3": {"name": "Identity_3"}}}',
json)
j.write(subst)
LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
except Exception as e:
LOGGER.info(f'\n{prefix} export failure: {e}')
@torch.no_grad()
def run(data=ROOT / 'data/coco128.yaml', # 'dataset.yaml path'
weights=ROOT / 'yolov5s.pt', # weights path
imgsz=(640, 640), # image (height, width)
batch_size=1, # batch size
device='cpu', # cuda device, i.e. 0 or 0,1,2,3 or cpu
include=('torchscript', 'onnx', 'coreml'), # include formats
half=False, # FP16 half-precision export
inplace=False, # set YOLOv5 Detect() inplace=True
train=False, # model.train() mode
optimize=False, # TorchScript: optimize for mobile
int8=False, # CoreML/TF INT8 quantization
dynamic=False, # ONNX/TF: dynamic axes
simplify=False, # ONNX: simplify model
opset=12, # ONNX: opset version
topk_per_class=100, # TF.js NMS: topk per class to keep
topk_all=100, # TF.js NMS: topk for all classes to keep
iou_thres=0.45, # TF.js NMS: IoU threshold
conf_thres=0.25 # TF.js NMS: confidence threshold
):
t = time.time()
include = [x.lower() for x in include]
tf_exports = list(x in include for x in ('saved_model', 'pb', 'tflite', 'tfjs')) # TensorFlow exports
imgsz *= 2 if len(imgsz) == 1 else 1 # expand
file = Path(url2file(weights) if str(weights).startswith(('http:/', 'https:/')) else weights)
# Load PyTorch model
device = select_device(device)
assert not (device.type == 'cpu' and half), '--half only compatible with GPU export, i.e. use --device 0'
model = attempt_load(weights, map_location=device, inplace=True, fuse=True) # load FP32 model
nc, names = model.nc, model.names # number of classes, class names
# Input
gs = int(max(model.stride)) # grid size (max stride)
imgsz = [check_img_size(x, gs) for x in imgsz] # verify img_size are gs-multiples
im = torch.zeros(batch_size, 3, *imgsz).to(device) # image size(1,3,320,192) BCHW iDetection
# Update model
if half:
im, model = im.half(), model.half() # to FP16
model.train() if train else model.eval() # training mode = no Detect() layer grid construction
for k, m in model.named_modules():
if isinstance(m, Conv): # assign export-friendly activations
if isinstance(m.act, nn.SiLU):
m.act = SiLU()
elif isinstance(m, Detect):
m.inplace = inplace
m.onnx_dynamic = dynamic
# m.forward = m.forward_export # assign forward (optional)
for _ in range(2):
y = model(im) # dry runs
LOGGER.info(f"\n{colorstr('PyTorch:')} starting from {file} ({file_size(file):.1f} MB)")
# Exports
if 'torchscript' in include:
export_torchscript(model, im, file, optimize)
if 'onnx' in include:
export_onnx(model, im, file, opset, train, dynamic, simplify)
if 'coreml' in include:
export_coreml(model, im, file)
# TensorFlow Exports
if any(tf_exports):
pb, tflite, tfjs = tf_exports[1:]
assert not (tflite and tfjs), 'TFLite and TF.js models must be exported separately, please pass only one type.'
model = export_saved_model(model, im, file, dynamic, tf_nms=tfjs, agnostic_nms=tfjs,
topk_per_class=topk_per_class, topk_all=topk_all, conf_thres=conf_thres,
iou_thres=iou_thres) # keras model
if pb or tfjs: # pb prerequisite to tfjs
export_pb(model, im, file)
if tflite:
export_tflite(model, im, file, int8=int8, data=data, ncalib=100)
if tfjs:
export_tfjs(model, im, file)
# Finish
LOGGER.info(f'\nExport complete ({time.time() - t:.2f}s)'
f"\nResults saved to {colorstr('bold', file.parent.resolve())}"
f'\nVisualize with https://netron.app')
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='dataset.yaml path')
parser.add_argument('--weights', type=str, default=ROOT / 'yolov5s.pt', help='weights path')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640, 640], help='image (h, w)')
parser.add_argument('--batch-size', type=int, default=1, help='batch size')
parser.add_argument('--device', default='cpu', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--half', action='store_true', help='FP16 half-precision export')
parser.add_argument('--inplace', action='store_true', help='set YOLOv5 Detect() inplace=True')
parser.add_argument('--train', action='store_true', help='model.train() mode')
parser.add_argument('--optimize', action='store_true', help='TorchScript: optimize for mobile')
parser.add_argument('--int8', action='store_true', help='CoreML/TF INT8 quantization')
parser.add_argument('--dynamic', action='store_true', help='ONNX/TF: dynamic axes')
parser.add_argument('--simplify', action='store_true', help='ONNX: simplify model')
parser.add_argument('--opset', type=int, default=13, help='ONNX: opset version')
parser.add_argument('--topk-per-class', type=int, default=100, help='TF.js NMS: topk per class to keep')
parser.add_argument('--topk-all', type=int, default=100, help='TF.js NMS: topk for all classes to keep')
parser.add_argument('--iou-thres', type=float, default=0.45, help='TF.js NMS: IoU threshold')
parser.add_argument('--conf-thres', type=float, default=0.25, help='TF.js NMS: confidence threshold')
parser.add_argument('--include', nargs='+',
default=['torchscript', 'onnx'],
help='available formats are (torchscript, onnx, coreml, saved_model, pb, tflite, tfjs)')
opt = parser.parse_args()
print_args(FILE.stem, opt)
return opt
def main(opt):
run(**vars(opt))
if __name__ == "__main__":
opt = parse_opt()
main(opt)
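# Usage sketch (flag names mirror parse_opt above; the weights path is illustrative):
#   python export.py --weights yolov5s.pt --include torchscript onnx --simplify
# or, programmatically:
#   from export import run
#   run(weights='yolov5s.pt', include=('onnx',), simplify=True)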

@ -0,0 +1,142 @@
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
PyTorch Hub models https://pytorch.org/hub/ultralytics_yolov5/
Usage:
import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
"""
import torch
def _create(name, pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
"""Creates a specified YOLOv5 model
Arguments:
name (str): name of model, i.e. 'yolov5s'
pretrained (bool): load pretrained weights into the model
channels (int): number of input channels
classes (int): number of model classes
autoshape (bool): apply YOLOv5 .autoshape() wrapper to model
verbose (bool): print all information to screen
device (str, torch.device, None): device to use for model parameters
Returns:
YOLOv5 pytorch model
"""
from pathlib import Path
from models.experimental import attempt_load
from models.yolo import Model
from utils.downloads import attempt_download
from utils.general import check_requirements, intersect_dicts, set_logging
from utils.torch_utils import select_device
file = Path(__file__).resolve()
check_requirements(exclude=('tensorboard', 'thop', 'opencv-python'))
set_logging(verbose=verbose)
save_dir = Path('') if str(name).endswith('.pt') else file.parent
path = (save_dir / name).with_suffix('.pt') # checkpoint path
try:
device = select_device(('0' if torch.cuda.is_available() else 'cpu') if device is None else device)
if pretrained and channels == 3 and classes == 80:
model = attempt_load(path, map_location=device) # download/load FP32 model
else:
cfg = list((Path(__file__).parent / 'models').rglob(f'{name}.yaml'))[0] # model.yaml path
model = Model(cfg, channels, classes) # create model
if pretrained:
ckpt = torch.load(attempt_download(path), map_location=device) # load
csd = ckpt['model'].float().state_dict() # checkpoint state_dict as FP32
csd = intersect_dicts(csd, model.state_dict(), exclude=['anchors']) # intersect
model.load_state_dict(csd, strict=False) # load
if len(ckpt['model'].names) == classes:
model.names = ckpt['model'].names # set class names attribute
if autoshape:
model = model.autoshape() # for file/URI/PIL/cv2/np inputs and NMS
return model.to(device)
except Exception as e:
help_url = 'https://github.com/ultralytics/yolov5/issues/36'
s = 'Cache may be out of date, try `force_reload=True`. See %s for help.' % help_url
raise Exception(s) from e
def custom(path='path/to/model.pt', autoshape=True, verbose=True, device=None):
# YOLOv5 custom or local model
return _create(path, autoshape=autoshape, verbose=verbose, device=device)
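# Usage sketch for these hub entry points (paths and arguments are illustrative;
# torch.hub.load forwards keyword arguments to the entry point):
#   import torch
#   model = torch.hub.load('ultralytics/yolov5', 'custom', path='path/to/best.pt')  # local weights
#   model = torch.hub.load('ultralytics/yolov5', 'yolov5m', classes=10, pretrained=False)  # fresh 10-class head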
def yolov5n(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-nano model https://github.com/ultralytics/yolov5
return _create('yolov5n', pretrained, channels, classes, autoshape, verbose, device)
def yolov5s(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-small model https://github.com/ultralytics/yolov5
return _create('yolov5s', pretrained, channels, classes, autoshape, verbose, device)
def yolov5m(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-medium model https://github.com/ultralytics/yolov5
return _create('yolov5m', pretrained, channels, classes, autoshape, verbose, device)
def yolov5l(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-large model https://github.com/ultralytics/yolov5
return _create('yolov5l', pretrained, channels, classes, autoshape, verbose, device)
def yolov5x(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-xlarge model https://github.com/ultralytics/yolov5
return _create('yolov5x', pretrained, channels, classes, autoshape, verbose, device)
def yolov5n6(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-nano-P6 model https://github.com/ultralytics/yolov5
return _create('yolov5n6', pretrained, channels, classes, autoshape, verbose, device)
def yolov5s6(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-small-P6 model https://github.com/ultralytics/yolov5
return _create('yolov5s6', pretrained, channels, classes, autoshape, verbose, device)
def yolov5m6(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-medium-P6 model https://github.com/ultralytics/yolov5
return _create('yolov5m6', pretrained, channels, classes, autoshape, verbose, device)
def yolov5l6(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-large-P6 model https://github.com/ultralytics/yolov5
return _create('yolov5l6', pretrained, channels, classes, autoshape, verbose, device)
def yolov5x6(pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
# YOLOv5-xlarge-P6 model https://github.com/ultralytics/yolov5
return _create('yolov5x6', pretrained, channels, classes, autoshape, verbose, device)
if __name__ == '__main__':
model = _create(name='yolov5s', pretrained=True, channels=3, classes=80, autoshape=True, verbose=True) # pretrained
# model = custom(path='path/to/model.pt') # custom
# Verify inference
from pathlib import Path
import cv2
import numpy as np
from PIL import Image
imgs = ['data/images/zidane.jpg', # filename
Path('data/images/zidane.jpg'), # Path
'https://ultralytics.com/images/zidane.jpg', # URI
cv2.imread('data/images/bus.jpg')[:, :, ::-1], # OpenCV
Image.open('data/images/bus.jpg'), # PIL
np.zeros((320, 640, 3))] # numpy
results = model(imgs) # batched inference
results.print()
results.save()
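# The Detections object returned by the autoshape wrapper also exposes pandas
# DataFrames; a small sketch of that stock API:
df = results.pandas().xyxy[0]  # per-image DataFrame: xmin, ymin, xmax, ymax, confidence, class, name
print(df[df['confidence'] > 0.5])  # keep only confident detections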

[15 binary image files added; previews not shown. Sizes: 33 KiB, 216 KiB, 151 KiB, 26 KiB, 28 KiB, 27 KiB, 90 KiB, 54 KiB, 21 KiB, 59 KiB, 1.6 MiB, 88 KiB, 21 KiB, 33 KiB, 2.4 KiB.]

@ -0,0 +1,2 @@
Reads an image from the img directory once per second and stores the text recognized in that image in the txt directory.
The txt directory also serves as a rolling record: of the stored backup files (the recognition history), only roughly the last minute (the 60 most recent results) is kept.

@ -0,0 +1,98 @@
import pytesseract
from PIL import Image
import sys
import os
import time
import datetime
def ensure_directories():
    # Make sure the directory for the output text exists
    output_dir = 'txt'
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

def ensure_pytesseract_installed():
    try:
        # Try importing pytesseract to confirm it is installed
        import pytesseract
    except ImportError:
        print("The pytesseract library is not installed. Run 'pip install pytesseract' to install it.")
        sys.exit(1)

def ensure_tesseract_executable_configured():
    try:
        # Read the Tesseract path to confirm it has been configured
        pytesseract.pytesseract.tesseract_cmd
    except AttributeError:
        print("The path to the Tesseract executable is not configured. Set tesseract_cmd in pytesseract.")
        sys.exit(1)

def image_to_text(image_path):
    try:
        # Open the image file
        img = Image.open(image_path)
    except IOError:
        print(f"Cannot open image file: {image_path}")
        return None
    try:
        # Recognize the text with pytesseract
        text = pytesseract.image_to_string(img, lang='eng')
    except Exception as e:
        print(f"An error occurred during text recognition: {e}")
        return None
    return text

def main():
    # Make sure the pytesseract library is installed
    ensure_pytesseract_installed()
    # Make sure the Tesseract executable path is configured
    ensure_tesseract_executable_configured()
    # Make sure the output directory exists
    ensure_directories()
    # Path of the output directory
    output_dir_path = 'txt'
    # Main loop: process one image per second
    while True:
        # Use the current time, formatted, as the file name
        current_time = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        image_filename = f"{current_time}.jpg"
        image_path = os.path.join('img', image_filename)
        output_file_path = os.path.join(output_dir_path, f"{current_time}.txt")
        # Check whether the image exists
        if not os.path.exists(image_path):
            print(f"Image {image_path} does not exist, waiting for the next one...")
            time.sleep(1)
            continue
        # Call image_to_text and write the result to a file
        text = image_to_text(image_path)
        if text:
            with open(output_file_path, 'w', encoding='utf-8') as file:
                file.write(text)
            print(f"Recognition result for image {image_filename} saved to {output_file_path}")
        else:
            print(f"Could not recognize any text in image {image_filename}.")
        # If the txt folder holds more than 60 files, delete the oldest one
        files = os.listdir(output_dir_path)
        if len(files) > 60:
            # Sort the files so the oldest comes first
            files.sort()
            oldest_file_path = os.path.join(output_dir_path, files[0])
            os.remove(oldest_file_path)
            print(f"Deleted the oldest file: {oldest_file_path}")
        # Wait one second
        time.sleep(1)

if __name__ == "__main__":
    main()

@ -0,0 +1,242 @@
2024年 07月 01日 星期一 21:13:13 CST: wenzhi.py 被停止, PID: 29387
2024年 07月 01日 星期一 21:13:13 CST: tts.py 运行失败,退出状态码 1, PID: 29405
2024年 07月 01日 星期一 21:13:18 CST: wenzhi.py 被停止, PID: 29408
2024年 07月 01日 星期一 21:13:18 CST: tts.py 运行失败,退出状态码 1, PID: 29435
2024年 07月 01日 星期一 21:13:23 CST: wenzhi.py 被停止, PID: 29456
2024年 07月 01日 星期一 21:13:23 CST: tts.py 运行失败,退出状态码 1, PID: 29474
2024年 07月 01日 星期一 21:14:26 CST: wenzhi.py 被停止, PID: 29817
2024年 07月 01日 星期一 21:14:26 CST: tts.py 运行失败,退出状态码 1, PID: 29835
2024年 07月 01日 星期一 21:14:31 CST: wenzhi.py 被停止, PID: 29838
2024年 07月 01日 星期一 21:14:31 CST: tts.py 运行失败,退出状态码 1, PID: 29865
2024年 07月 01日 星期一 21:16:08 CST: wenzhi.py 被停止, PID: 30373
2024年 07月 01日 星期一 21:16:08 CST: tts.py 运行失败,退出状态码 1, PID: 30391
2024年 07月 01日 星期一 21:16:13 CST: wenzhi.py 被停止, PID: 30412
2024年 07月 01日 星期一 21:16:13 CST: tts.py 运行失败,退出状态码 1, PID: 30466
2024年 07月 01日 星期一 21:16:18 CST: wenzhi.py 被停止, PID: 30469
2024年 07月 01日 星期一 21:16:18 CST: tts.py 运行失败,退出状态码 1, PID: 30550
2024年 07月 01日 星期一 21:16:42 CST: wenzhi.py 被停止, PID: 30664
2024年 07月 01日 星期一 21:16:42 CST: tts.py 运行失败,退出状态码 1, PID: 30691
2024年 07月 01日 星期一 21:16:47 CST: wenzhi.py 被停止, PID: 30694
2024年 07月 01日 星期一 21:16:47 CST: tts.py 运行失败,退出状态码 1, PID: 30721
2024年 07月 01日 星期一 21:16:52 CST: wenzhi.py 被停止, PID: 30733
2024年 07月 01日 星期一 21:16:52 CST: tts.py 运行失败,退出状态码 1, PID: 30769
2024年 07月 01日 星期一 21:16:57 CST: wenzhi.py 被停止, PID: 30781
2024年 07月 01日 星期一 21:16:57 CST: tts.py 运行失败,退出状态码 1, PID: 30835
2024年 07月 01日 星期一 21:17:02 CST: wenzhi.py 被停止, PID: 30838
2024年 07月 01日 星期一 21:17:02 CST: tts.py 运行失败,退出状态码 1, PID: 30895
2024年 07月 01日 星期一 21:19:05 CST: wenzhi.py 被停止, PID: 31525
2024年 07月 01日 星期一 21:19:05 CST: tts.py 运行失败,退出状态码 1, PID: 31543
2024年 07月 01日 星期一 21:19:10 CST: wenzhi.py 被停止, PID: 31546
2024年 07月 01日 星期一 21:19:10 CST: tts.py 运行失败,退出状态码 1, PID: 31573
2024年 07月 01日 星期一 21:20:16 CST: wenzhi.py 被停止, PID: 31924
2024年 07月 01日 星期一 21:20:17 CST: tts.py 运行失败,退出状态码 1, PID: 31960
2024年 07月 01日 星期一 21:20:22 CST: wenzhi.py 被停止, PID: 31981
2024年 07月 01日 星期一 21:20:22 CST: tts.py 运行失败,退出状态码 1, PID: 32008
2024年 07月 01日 星期一 21:20:27 CST: wenzhi.py 被停止, PID: 32029
2024年 07月 01日 星期一 21:20:27 CST: tts.py 运行失败,退出状态码 1, PID: 32056
2024年 07月 01日 星期一 21:20:32 CST: wenzhi.py 被停止, PID: 32059
2024年 07月 01日 星期一 21:20:32 CST: tts.py 运行失败,退出状态码 1, PID: 32086
2024年 07月 01日 星期一 21:20:37 CST: wenzhi.py 被停止, PID: 32125
2024年 07月 01日 星期一 21:20:37 CST: tts.py 运行失败,退出状态码 1, PID: 32161
2024年 07月 01日 星期一 21:20:42 CST: wenzhi.py 被停止, PID: 32191
2024年 07月 01日 星期一 21:20:42 CST: tts.py 运行失败,退出状态码 1, PID: 32264
2024年 07月 01日 星期一 21:20:47 CST: wenzhi.py 被停止, PID: 32267
2024年 07月 01日 星期一 21:20:47 CST: tts.py 运行失败,退出状态码 1, PID: 32303
2024年 07月 01日 星期一 21:20:52 CST: wenzhi.py 被停止, PID: 32306
2024年 07月 01日 星期一 21:20:52 CST: tts.py 运行失败,退出状态码 1, PID: 32351
2024年 07月 01日 星期一 21:20:57 CST: wenzhi.py 被停止, PID: 32354
2024年 07月 01日 星期一 21:20:57 CST: tts.py 运行失败,退出状态码 1, PID: 32426
2024年 07月 01日 星期一 23:05:26 CST: wenzhi.py 被停止, PID: 8301
2024年 07月 01日 星期一 23:05:26 CST: tts.py 运行失败,退出状态码 1, PID: 8318
2024年 07月 01日 星期一 23:05:28 CST: wenzhi.py 被停止, PID: 8322
2024年 07月 01日 星期一 23:05:28 CST: tts.py 运行失败,退出状态码 1, PID: 8339
2024年 07月 01日 星期一 23:05:30 CST: wenzhi.py 被停止, PID: 8352
2024年 07月 01日 星期一 23:05:30 CST: tts.py 运行失败,退出状态码 1, PID: 8369
2024年 07月 01日 星期一 23:05:32 CST: wenzhi.py 被停止, PID: 8373
2024年 07月 01日 星期一 23:05:32 CST: tts.py 运行失败,退出状态码 1, PID: 8400
2024年 07月 01日 星期一 23:05:34 CST: wenzhi.py 被停止, PID: 8422
2024年 07月 01日 星期一 23:05:34 CST: tts.py 运行失败,退出状态码 1, PID: 8448
2024年 07月 01日 星期一 23:09:24 CST: wenzhi.py 被停止, PID: 10042
2024年 07月 01日 星期一 23:09:24 CST: tts.py 运行失败,退出状态码 1, PID: 10069
2024年 07月 01日 星期一 23:09:26 CST: wenzhi.py 被停止, PID: 10073
2024年 07月 01日 星期一 23:09:27 CST: tts.py 运行失败,退出状态码 1, PID: 10090
2024年 07月 01日 星期一 23:09:29 CST: wenzhi.py 被停止, PID: 10121
2024年 07月 01日 星期一 23:09:29 CST: tts.py 运行失败,退出状态码 1, PID: 10129
2024年 07月 01日 星期一 23:09:31 CST: wenzhi.py 被停止, PID: 10178
2024年 07月 01日 星期一 23:09:31 CST: tts.py 运行失败,退出状态码 1, PID: 10195
2024年 07月 01日 星期一 23:09:33 CST: wenzhi.py 被停止, PID: 10208
2024年 07月 01日 星期一 23:09:33 CST: tts.py 运行失败,退出状态码 1, PID: 10225
2024年 07月 01日 星期一 23:09:35 CST: wenzhi.py 被停止, PID: 10229
2024年 07月 01日 星期一 23:09:35 CST: tts.py 运行失败,退出状态码 1, PID: 10237
2024年 07月 01日 星期一 23:09:37 CST: wenzhi.py 被停止, PID: 10241
2024年 07月 01日 星期一 23:09:37 CST: tts.py 运行失败,退出状态码 1, PID: 10249
2024年 07月 01日 星期一 23:09:39 CST: wenzhi.py 被停止, PID: 10289
2024年 07月 01日 星期一 23:09:39 CST: tts.py 运行失败,退出状态码 1, PID: 10306
2024年 07月 01日 星期一 23:09:41 CST: wenzhi.py 被停止, PID: 10310
2024年 07月 01日 星期一 23:09:41 CST: tts.py 运行失败,退出状态码 1, PID: 10327
2024年 07月 01日 星期一 23:09:43 CST: wenzhi.py 被停止, PID: 10340
2024年 07月 01日 星期一 23:09:43 CST: tts.py 运行失败,退出状态码 1, PID: 10357
2024年 07月 01日 星期一 23:09:45 CST: wenzhi.py 被停止, PID: 10379
2024年 07月 01日 星期一 23:09:45 CST: tts.py 运行失败,退出状态码 1, PID: 10396
2024年 07月 01日 星期一 23:13:21 CST: wenzhi.py 被停止, PID: 11686
2024年 07月 01日 星期一 23:13:21 CST: tts.py 运行成功, PID: 11694
2024年 07月 01日 星期一 23:13:23 CST: wenzhi.py 被停止, PID: 11698
2024年 07月 01日 星期一 23:13:23 CST: tts.py 运行成功, PID: 11733
2024年 07月 01日 星期一 23:14:23 CST: wenzhi.py 被停止, PID: 12133
2024年 07月 01日 星期一 23:14:23 CST: tts.py 运行成功, PID: 12141
2024年 07月 01日 星期一 23:14:25 CST: wenzhi.py 被停止, PID: 12154
2024年 07月 01日 星期一 23:14:25 CST: tts.py 运行成功, PID: 12180
2024年 07月 01日 星期一 23:16:13 CST: wenzhi.py 被停止, PID: 12910
2024年 07月 01日 星期一 23:16:13 CST: tts.py 运行成功, PID: 12918
2024年 07月 01日 星期一 23:16:15 CST: wenzhi.py 被停止, PID: 12940
2024年 07月 01日 星期一 23:16:15 CST: tts.py 运行成功, PID: 12957
2024年 07月 01日 星期一 23:16:17 CST: wenzhi.py 被停止, PID: 12961
2024年 07月 01日 星期一 23:16:17 CST: tts.py 运行成功, PID: 12978
2024年 07月 01日 星期一 23:16:19 CST: wenzhi.py 被停止, PID: 12982
2024年 07月 01日 星期一 23:16:19 CST: tts.py 运行成功, PID: 12990
2024年 07月 01日 星期一 23:16:21 CST: wenzhi.py 被停止, PID: 12994
2024年 07月 01日 星期一 23:16:21 CST: tts.py 运行成功, PID: 13002
2024年 07月 01日 星期一 23:16:23 CST: wenzhi.py 被停止, PID: 13024
2024年 07月 01日 星期一 23:16:23 CST: tts.py 运行成功, PID: 13041
2024年 07月 01日 星期一 23:16:25 CST: wenzhi.py 被停止, PID: 13054
2024年 07月 01日 星期一 23:16:25 CST: tts.py 运行成功, PID: 13062
2024年 07月 01日 星期一 23:16:27 CST: wenzhi.py 被停止, PID: 13075
2024年 07月 01日 星期一 23:16:27 CST: tts.py 运行成功, PID: 13083
2024年 07月 01日 星期一 23:17:10 CST: wenzhi.py 被停止, PID: 13388
2024年 07月 01日 星期一 23:17:10 CST: tts.py 运行成功, PID: 13396
2024年 07月 01日 星期一 23:17:12 CST: wenzhi.py 被停止, PID: 13400
2024年 07月 01日 星期一 23:17:12 CST: tts.py 运行成功, PID: 13408
2024年 07月 01日 星期一 23:17:23 CST: wenzhi.py 被停止, PID: 13502
2024年 07月 01日 星期一 23:17:23 CST: tts.py 运行成功, PID: 13510
2024年 07月 01日 星期一 23:17:25 CST: wenzhi.py 被停止, PID: 13514
2024年 07月 01日 星期一 23:17:25 CST: tts.py 运行成功, PID: 13522
2024年 07月 01日 星期一 23:17:27 CST: wenzhi.py 被停止, PID: 13526
2024年 07月 01日 星期一 23:17:27 CST: tts.py 运行成功, PID: 13534
2024年 07月 01日 星期一 23:17:59 CST: wenzhi.py 被停止, PID: 13762
2024年 07月 01日 星期一 23:17:59 CST: tts.py 运行失败,退出状态码 1, PID: 13779
2024年 07月 01日 星期一 23:18:01 CST: wenzhi.py 被停止, PID: 13783
2024年 07月 01日 星期一 23:18:01 CST: tts.py 运行失败,退出状态码 1, PID: 13791
2024年 07月 01日 星期一 23:18:03 CST: wenzhi.py 被停止, PID: 13795
2024年 07月 01日 星期一 23:18:03 CST: tts.py 运行失败,退出状态码 1, PID: 13821
2024年 07月 01日 星期一 23:18:05 CST: wenzhi.py 被停止, PID: 13825
2024年 07月 01日 星期一 23:18:05 CST: tts.py 运行失败,退出状态码 1, PID: 13833
2024年 07月 01日 星期一 23:18:07 CST: wenzhi.py 被停止, PID: 13837
2024年 07月 01日 星期一 23:18:07 CST: tts.py 运行失败,退出状态码 1, PID: 13854
2024年 07月 01日 星期一 23:18:09 CST: wenzhi.py 被停止, PID: 13867
2024年 07月 01日 星期一 23:18:09 CST: tts.py 运行失败,退出状态码 1, PID: 13875
2024年 07月 01日 星期一 23:18:11 CST: wenzhi.py 被停止, PID: 13879
2024年 07月 01日 星期一 23:18:11 CST: tts.py 运行失败,退出状态码 1, PID: 13887
2024年 07月 01日 星期一 23:18:13 CST: wenzhi.py 被停止, PID: 13891
2024年 07月 01日 星期一 23:18:13 CST: tts.py 运行失败,退出状态码 1, PID: 13899
2024年 07月 01日 星期一 23:18:15 CST: wenzhi.py 被停止, PID: 13903
2024年 07月 01日 星期一 23:18:15 CST: tts.py 运行失败,退出状态码 1, PID: 13911
2024年 07月 01日 星期一 23:18:17 CST: wenzhi.py 被停止, PID: 13915
2024年 07月 01日 星期一 23:18:18 CST: tts.py 运行失败,退出状态码 1, PID: 13923
2024年 07月 01日 星期一 23:18:20 CST: wenzhi.py 被停止, PID: 13927
2024年 07月 01日 星期一 23:18:20 CST: tts.py 运行失败,退出状态码 1, PID: 13935
2024年 07月 01日 星期一 23:18:22 CST: wenzhi.py 被停止, PID: 13939
2024年 07月 01日 星期一 23:18:22 CST: tts.py 运行失败,退出状态码 1, PID: 13947
2024年 07月 01日 星期一 23:18:24 CST: wenzhi.py 被停止, PID: 13960
2024年 07月 01日 星期一 23:18:24 CST: tts.py 运行失败,退出状态码 1, PID: 13968
2024年 07月 01日 星期一 23:18:26 CST: wenzhi.py 被停止, PID: 13972
2024年 07月 01日 星期一 23:18:26 CST: tts.py 运行失败,退出状态码 1, PID: 13980
2024年 07月 01日 星期一 23:18:28 CST: wenzhi.py 被停止, PID: 13984
2024年 07月 01日 星期一 23:18:28 CST: tts.py 运行失败,退出状态码 1, PID: 13992
2024年 07月 01日 星期一 23:18:30 CST: wenzhi.py 被停止, PID: 13996
2024年 07月 01日 星期一 23:18:30 CST: tts.py 运行失败,退出状态码 1, PID: 14004
2024年 07月 01日 星期一 23:18:32 CST: wenzhi.py 被停止, PID: 14017
2024年 07月 01日 星期一 23:18:32 CST: tts.py 运行失败,退出状态码 1, PID: 14025
2024年 07月 01日 星期一 23:18:34 CST: wenzhi.py 被停止, PID: 14029
2024年 07月 01日 星期一 23:18:34 CST: tts.py 运行失败,退出状态码 1, PID: 14046
2024年 07月 01日 星期一 23:18:36 CST: wenzhi.py 被停止, PID: 14050
2024年 07月 01日 星期一 23:18:36 CST: tts.py 运行失败,退出状态码 1, PID: 14076
2024年 07月 01日 星期一 23:18:38 CST: wenzhi.py 被停止, PID: 14080
2024年 07月 01日 星期一 23:18:38 CST: tts.py 运行失败,退出状态码 1, PID: 14088
2024年 07月 01日 星期一 23:18:40 CST: wenzhi.py 被停止, PID: 14092
2024年 07月 01日 星期一 23:18:40 CST: tts.py 运行失败,退出状态码 1, PID: 14109
2024年 07月 01日 星期一 23:18:42 CST: wenzhi.py 被停止, PID: 14113
2024年 07月 01日 星期一 23:18:42 CST: tts.py 运行失败,退出状态码 1, PID: 14130
2024年 07月 01日 星期一 23:18:44 CST: wenzhi.py 被停止, PID: 14134
2024年 07月 01日 星期一 23:18:44 CST: tts.py 运行失败,退出状态码 1, PID: 14142
2024年 07月 01日 星期一 23:18:46 CST: wenzhi.py 被停止, PID: 14146
2024年 07月 01日 星期一 23:18:46 CST: tts.py 运行失败,退出状态码 1, PID: 14154
2024年 07月 01日 星期一 23:18:48 CST: wenzhi.py 被停止, PID: 14167
2024年 07月 01日 星期一 23:18:48 CST: tts.py 运行失败,退出状态码 1, PID: 14184
2024年 07月 01日 星期一 23:18:50 CST: wenzhi.py 被停止, PID: 14197
2024年 07月 01日 星期一 23:18:50 CST: tts.py 运行失败,退出状态码 1, PID: 14223
2024年 07月 01日 星期一 23:18:52 CST: wenzhi.py 被停止, PID: 14227
2024年 07月 01日 星期一 23:18:52 CST: tts.py 运行失败,退出状态码 1, PID: 14235
2024年 07月 01日 星期一 23:18:54 CST: wenzhi.py 被停止, PID: 14248
2024年 07月 01日 星期一 23:18:54 CST: tts.py 运行失败,退出状态码 1, PID: 14282
2024年 07月 01日 星期一 23:18:56 CST: wenzhi.py 被停止, PID: 14296
2024年 07月 01日 星期一 23:18:56 CST: tts.py 运行失败,退出状态码 1, PID: 14313
2024年 07月 01日 星期一 23:18:58 CST: wenzhi.py 被停止, PID: 14334
2024年 07月 01日 星期一 23:18:58 CST: tts.py 运行失败,退出状态码 1, PID: 14361
2024年 07月 01日 星期一 23:19:00 CST: wenzhi.py 被停止, PID: 14374
2024年 07月 01日 星期一 23:19:00 CST: tts.py 运行失败,退出状态码 1, PID: 14391
2024年 07月 01日 星期一 23:19:02 CST: wenzhi.py 被停止, PID: 14395
2024年 07月 01日 星期一 23:19:02 CST: tts.py 运行失败,退出状态码 1, PID: 14403
2024年 07月 01日 星期一 23:19:04 CST: wenzhi.py 被停止, PID: 14425
2024年 07月 01日 星期一 23:19:04 CST: tts.py 运行失败,退出状态码 1, PID: 14433
2024年 07月 01日 星期一 23:19:06 CST: wenzhi.py 被停止, PID: 14446
2024年 07月 01日 星期一 23:19:07 CST: tts.py 运行失败,退出状态码 1, PID: 14463
2024年 07月 01日 星期一 23:19:09 CST: wenzhi.py 被停止, PID: 14485
2024年 07月 01日 星期一 23:19:09 CST: tts.py 运行失败,退出状态码 1, PID: 14502
2024年 07月 01日 星期一 23:19:11 CST: wenzhi.py 被停止, PID: 14515
2024年 07月 01日 星期一 23:19:11 CST: tts.py 运行失败,退出状态码 1, PID: 14532
2024年 07月 01日 星期一 23:19:13 CST: wenzhi.py 被停止, PID: 14536
2024年 07月 01日 星期一 23:19:13 CST: tts.py 运行失败,退出状态码 1, PID: 14544
2024年 07月 01日 星期一 23:19:15 CST: wenzhi.py 被停止, PID: 14548
2024年 07月 01日 星期一 23:19:15 CST: tts.py 运行失败,退出状态码 1, PID: 14574
2024年 07月 01日 星期一 23:19:17 CST: wenzhi.py 被停止, PID: 14578
2024年 07月 01日 星期一 23:19:17 CST: tts.py 运行失败,退出状态码 1, PID: 14586
2024年 07月 01日 星期一 23:19:19 CST: wenzhi.py 被停止, PID: 14599
2024年 07月 01日 星期一 23:19:19 CST: tts.py 运行失败,退出状态码 1, PID: 14616
2024年 07月 01日 星期一 23:19:21 CST: wenzhi.py 被停止, PID: 14620
2024年 07月 01日 星期一 23:19:21 CST: tts.py 运行失败,退出状态码 1, PID: 14646
2024年 07月 01日 星期一 23:19:23 CST: wenzhi.py 被停止, PID: 14650
2024年 07月 01日 星期一 23:19:23 CST: tts.py 运行失败,退出状态码 1, PID: 14658
2024年 07月 01日 星期一 23:19:25 CST: wenzhi.py 被停止, PID: 14671
2024年 07月 01日 星期一 23:19:25 CST: tts.py 运行失败,退出状态码 1, PID: 14679
2024年 07月 01日 星期一 23:19:27 CST: wenzhi.py 被停止, PID: 14692
2024年 07月 01日 星期一 23:19:27 CST: tts.py 运行失败,退出状态码 1, PID: 14709
2024年 07月 01日 星期一 23:19:29 CST: wenzhi.py 被停止, PID: 14713
2024年 07月 01日 星期一 23:19:29 CST: tts.py 运行失败,退出状态码 1, PID: 14721
2024年 07月 01日 星期一 23:19:31 CST: wenzhi.py 被停止, PID: 14735
2024年 07月 01日 星期一 23:19:31 CST: tts.py 运行失败,退出状态码 1, PID: 14743
2024年 07月 01日 星期一 23:19:33 CST: wenzhi.py 被停止, PID: 14756
2024年 07月 01日 星期一 23:19:33 CST: tts.py 运行失败,退出状态码 1, PID: 14773
2024年 07月 01日 星期一 23:19:35 CST: wenzhi.py 被停止, PID: 14777
2024年 07月 01日 星期一 23:19:35 CST: tts.py 运行失败,退出状态码 1, PID: 14785
2024年 07月 01日 星期一 23:19:37 CST: wenzhi.py 被停止, PID: 14798
2024年 07月 01日 星期一 23:19:37 CST: tts.py 运行失败,退出状态码 1, PID: 14815
2024年 07月 01日 星期一 23:19:39 CST: wenzhi.py 被停止, PID: 14819
2024年 07月 01日 星期一 23:19:39 CST: tts.py 运行失败,退出状态码 1, PID: 14836
2024年 07月 01日 星期一 23:19:41 CST: wenzhi.py 被停止, PID: 14840
2024年 07月 01日 星期一 23:19:41 CST: tts.py 运行失败,退出状态码 1, PID: 14848
2024年 07月 01日 星期一 23:19:43 CST: wenzhi.py 被停止, PID: 14852
2024年 07月 01日 星期一 23:19:43 CST: tts.py 运行失败,退出状态码 1, PID: 14860
2024年 07月 01日 星期一 23:19:45 CST: wenzhi.py 被停止, PID: 14864
2024年 07月 01日 星期一 23:19:45 CST: tts.py 运行失败,退出状态码 1, PID: 14872
2024年 07月 01日 星期一 23:19:47 CST: wenzhi.py 被停止, PID: 14885
2024年 07月 01日 星期一 23:19:47 CST: tts.py 运行失败,退出状态码 1, PID: 14893
2024年 07月 01日 星期一 23:19:49 CST: wenzhi.py 被停止, PID: 14897
2024年 07月 01日 星期一 23:19:49 CST: tts.py 运行失败,退出状态码 1, PID: 14905
2024年 07月 01日 星期一 23:19:51 CST: wenzhi.py 被停止, PID: 14909
2024年 07月 01日 星期一 23:19:51 CST: tts.py 运行失败,退出状态码 1, PID: 14926
2024年 07月 01日 星期一 23:19:53 CST: wenzhi.py 被停止, PID: 14948
2024年 07月 01日 星期一 23:19:53 CST: tts.py 运行失败,退出状态码 1, PID: 14956
2024年 07月 01日 星期一 23:19:55 CST: wenzhi.py 被停止, PID: 14969
2024年 07月 01日 星期一 23:19:55 CST: tts.py 运行失败,退出状态码 1, PID: 14977
2024年 07月 01日 星期一 23:19:58 CST: wenzhi.py 被停止, PID: 14981
2024年 07月 01日 星期一 23:19:58 CST: tts.py 运行失败,退出状态码 1, PID: 15007
2024年 07月 01日 星期一 23:20:00 CST: wenzhi.py 被停止, PID: 15020
2024年 07月 01日 星期一 23:20:00 CST: tts.py 运行失败,退出状态码 1, PID: 15037
2024年 07月 01日 星期一 23:20:02 CST: wenzhi.py 被停止, PID: 15041
2024年 07月 01日 星期一 23:20:02 CST: tts.py 运行失败,退出状态码 1, PID: 15049
2024年 07月 01日 星期一 23:20:04 CST: wenzhi.py 被停止, PID: 15053
2024年 07月 01日 星期一 23:20:04 CST: tts.py 运行失败,退出状态码 1, PID: 15061
2024年 07月 01日 星期一 23:20:06 CST: wenzhi.py 被停止, PID: 15067
2024年 07月 01日 星期一 23:20:06 CST: tts.py 运行失败,退出状态码 1, PID: 15075
2024年 07月 01日 星期一 23:20:08 CST: wenzhi.py 被停止, PID: 15079
2024年 07月 01日 星期一 23:20:08 CST: tts.py 运行失败,退出状态码 1, PID: 15105

@ -0,0 +1,40 @@
#!/bin/bash
# repted.sh
while true; do
    # Run the wenzhi.py script from the itt directory in the background and record its PID
    python3 itt/wenzhi.py &
    wenzhi_pid=$!
    # Give wenzhi.py a moment to start executing
    sleep 1
    # Check whether wenzhi.py is still running
    if kill -0 $wenzhi_pid 2>/dev/null; then
        # wenzhi.py is still running; try to stop it
        kill $wenzhi_pid
        wait $wenzhi_pid
        echo "$(date): wenzhi.py 被停止, PID: $wenzhi_pid" >> log.txt  # "wenzhi.py was stopped" (kept as-is to match log.txt)
    else
        # wenzhi.py has already stopped; log it
        echo "$(date): wenzhi.py 已经停止, PID: $wenzhi_pid" >> log.txt  # "wenzhi.py had already stopped"
    fi
    # Run the tts.py script from the tts directory in the background and record its PID
    python3 tts/tts.py &
    tts_pid=$!
    # Wait for the tts.py script to finish so its exit status can be read
    wait $tts_pid
    tts_exit_status=$?
    # Check whether the tts.py script ran successfully
    if [ $tts_exit_status -eq 0 ]; then
        echo "$(date): tts.py 运行成功, PID: $tts_pid" >> log.txt  # "tts.py ran successfully"
    else
        echo "$(date): tts.py 运行失败,退出状态码 $tts_exit_status, PID: $tts_pid" >> log.txt  # "tts.py failed, exit status N"
    fi
    # Wait a while before repeating
    sleep 1
done

@ -0,0 +1,69 @@
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Embedding, Flatten
from keras.preprocessing.sequence import pad_sequences
from keras.utils import to_categorical
import re

# Text-cleaning helper
def clean_text(text):
    # Strip non-standard characters with a regular expression
    cleaned_text = re.sub(r'[^a-zA-Z0-9\s]', '', text.lower())
    return cleaned_text

# Read a text file
def read_text_file(file_path):
    with open(file_path, 'r', encoding='utf-8') as f:
        return f.read()

# Data preparation
def prepare_data(text, max_sequence_length=10):
    words = set(clean_text(text).split())
    word2index = {word: i for i, word in enumerate(words, 1)}  # index 0 is reserved for padding
    index2word = {i: word for word, i in word2index.items()}
    sentences = text.split('. ')
    X, y = [], []
    for sentence in sentences:
        cleaned_sentence = clean_text(sentence)
        tokens = [word2index[word] for word in cleaned_sentence.split() if word in word2index]
        for i in range(1, len(tokens)):
            X.append(tokens[:i])  # context: the words before position i
            y.append(tokens[i])   # target: the next word
    # Pad the variable-length contexts with pad_sequences
    X = pad_sequences(X, maxlen=max_sequence_length, padding='pre', truncating='post')
    y = to_categorical(np.array(y), num_classes=len(word2index) + 1)  # +1 for the padding index
    return X, y, word2index, index2word

# Read the text file and prepare the data
max_sequence_length = 10
text = read_text_file('input.txt')  # the input file is assumed to be named input.txt
X, y, word2index, index2word = prepare_data(text, max_sequence_length)
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Build the NNLM; the Embedding layer captures similarity between words
model = Sequential()
model.add(Embedding(input_dim=len(word2index) + 1, output_dim=32, input_length=max_sequence_length))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(len(word2index) + 1, activation='softmax'))  # must match the one-hot target size
model.compile(optimizer='adam', loss='categorical_crossentropy')
# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32)
# Predict the next word for each test context
predictions = model.predict(X_test)
predicted_words = [int(np.argmax(pred)) for pred in predictions]
# Map the predicted indices back to words (index 0 is padding)
predicted_sentences = [index2word.get(word, '') for word in predicted_words]
# Write the predictions to a text file
with open('output.txt', 'w', encoding='utf-8') as f:
    for sentence in predicted_sentences:
        f.write(sentence + '\n')
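# Querying the trained model for a single context (a sketch; the prompt text is illustrative):
prompt_tokens = [word2index[w] for w in clean_text('the quick brown').split() if w in word2index]
prompt_padded = pad_sequences([prompt_tokens], maxlen=max_sequence_length, padding='pre')
next_index = int(np.argmax(model.predict(prompt_padded)[0]))
print('predicted next word:', index2word.get(next_index, '<pad>'))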

@ -0,0 +1,76 @@
import subprocess
import os
import re

def speak_text_from_file(file_path, voice='zh', speed=150, pitch=50, output_file=None):
    """
    Read text from a file and convert it to speech with espeak.
    :param file_path: path to the text file
    :param voice: voice to use, e.g. 'zh' for Chinese
    :param speed: speaking rate, default 150
    :param pitch: pitch, default 50
    :param output_file: if given, save the speech output as a WAV file
    """
    with open(file_path, 'r', encoding='utf-8') as file:
        text = file.read()
    # Build the espeak command
    cmd = ['espeak', '-v', voice, '-s', str(speed), '-p', str(pitch)]
    if output_file:
        cmd.extend(['-w', output_file])
    # Without an output file, espeak plays the speech directly; other options
    # (volume adjustment and so on) could be appended to cmd here.
    # Pass the text to espeak via stdin
    try:
        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stderr=subprocess.PIPE, encoding='utf-8')
        # communicate() consumes the piped stderr, so capture it from the return value
        _, error_message = proc.communicate(input=text)
        if proc.returncode != 0:
            # Print the captured error output
            print(f"Error executing espeak: {error_message}")
    except Exception as e:
        print(f"An error occurred: {e}")

# Current working directory
current_dir = os.getcwd()
specific_folder_name = "txt"
specific_folder_path = os.path.join(current_dir, specific_folder_name)
# List all txt files in that folder
txt_files = [f for f in os.listdir(specific_folder_path) if f.endswith('.txt')]
# Extract the number in each file name and find the largest one
max_number = -1
max_file = None
for txt_file in txt_files:
    # Pull the number out of the file name with a regular expression
    match = re.search(r'^(\d+)\.txt$', txt_file)
    if match:
        file_number = int(match.group(1))
        if file_number > max_number:
            max_number = file_number
            max_file = txt_file
# Check whether a txt file with the largest numeric name was found
if max_file:
    # The files were listed from the txt folder, so join with that path
    file_path = os.path.join(specific_folder_path, max_file)
    # Read the text file aloud
    speak_text_from_file(file_path, voice='zh', speed=140, pitch=55)
    # To also save the speech as a WAV file:
    output_wav_file = 'output/output.wav'  # output path for the WAV file
    speak_text_from_file(file_path, voice='zh', speed=140, pitch=55, output_file=output_wav_file)
else:
    print("No txt file with a numeric name was found.")
    print(specific_folder_path)
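# Sketch, not part of the original script: espeak must be on PATH for the calls above
# to succeed; a guard like this, called before speak_text_from_file, would fail fast
# with a clear message instead of a generic exit status 1.
def ensure_espeak_available():
    import shutil
    if shutil.which('espeak') is None:
        raise SystemExit("espeak not found; install it first (e.g. 'sudo apt install espeak').")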

Some files were not shown because too many files have changed in this diff.